Search | BVS Regional Portal

An open-source framework for end-to-end analysis of electronic health record data.

Heumos, Lukas; Ehmele, Philipp; Treis, Tim; Upmeier Zu Belzen, Julius; Roellin, Eljas; May, Lilly; Namsaraeva, Altana; Horlava, Nastassya; Shitov, Vladimir A; Zhang, Xinyue; Zappia, Luke; Knoll, Rainer; Lang, Niklas J; Hetzel, Leon; Virshup, Isaac; Sikkema, Lisa; Curion, Fabiola; Eils, Roland; Schiller, Herbert B; Hilgendorff, Anne; Theis, Fabian J.

Nat Med ; 2024 Sep 12.

Article in English | MEDLINE | ID: mdl-39266748

ABSTRACT

With progressive digitalization of healthcare systems worldwide, large-scale collection of electronic health records (EHRs) has become commonplace. However, an extensible framework for comprehensive exploratory analysis that accounts for data heterogeneity is missing. Here we introduce ehrapy, a modular open-source Python framework designed for exploratory analysis of heterogeneous epidemiology and EHR data. ehrapy incorporates a series of analytical steps, from data extraction and quality control to the generation of low-dimensional representations. Complemented by rich statistical modules, ehrapy facilitates associating patients with disease states, differential comparison between patient clusters, survival analysis, trajectory inference, causal inference and more. Leveraging ontologies, ehrapy further enables data sharing and training EHR deep learning models, paving the way for foundational models in biomedical research. We demonstrate ehrapy's features in six distinct examples. We applied ehrapy to stratify patients affected by unspecified pneumonia into finer-grained phenotypes. Furthermore, we reveal biomarkers for significant differences in survival among these groups. Additionally, we quantify medication-class effects of pneumonia medications on length of stay. We further leveraged ehrapy to analyze cardiovascular risks across different data modalities. We reconstructed disease state trajectories in patients with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) based on imaging data. Finally, we conducted a case study to demonstrate how ehrapy can detect and mitigate biases in EHR data. ehrapy, thus, provides a framework that we envision will standardize analysis pipelines on EHR data and serve as a cornerstone for the community.

SIMBSIG: similarity search and clustering for biobank-scale data.

Adamer, Michael F; Roellin, Eljas; Bourguignon, Lucie; Borgwardt, Karsten.

Bioinformatics ; 39(1), 2023 Jan 01.

Article in English | MEDLINE | ID: mdl-36610707

ABSTRACT

SUMMARY: In many modern bioinformatics applications, such as statistical genetics or single-cell analysis, one frequently encounters datasets that are orders of magnitude too large for conventional in-memory analysis. To tackle this challenge, we introduce SIMBSIG (SIMmilarity Batched Search Integrated GPU), a highly scalable Python package that provides a scikit-learn-like interface for out-of-core, GPU-enabled similarity searches, principal component analysis and clustering. Thanks to its PyTorch backend, it is highly modular and supports many data types, with a particular focus on biobank data analysis. AVAILABILITY AND IMPLEMENTATION: SIMBSIG is freely available from PyPI, and its source code and documentation can be found on GitHub (https://github.com/BorgwardtLab/simbsig) under a BSD-3 license.

Subject(s)

Biological Specimen Banks , Software , Computational Biology , Documentation , Cluster Analysis
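
The SIMBSIG abstract describes out-of-core similarity search, where the dataset is too large to hold in memory and is processed in batches. The sketch below illustrates that general idea only, using the Python standard library on the CPU. It is not SIMBSIG's API or implementation; the names `batches` and `batched_knn` and the `batch_size` parameter are hypothetical and introduced here purely for illustration.

```python
import heapq
import math
from typing import Iterable, Iterator, List, Sequence, Tuple

Vector = Sequence[float]


def batches(rows: Iterable[Vector], batch_size: int) -> Iterator[List[Vector]]:
    """Yield consecutive chunks so that only one chunk is held in memory at a time."""
    batch: List[Vector] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly smaller, chunk
        yield batch


def batched_knn(
    query: Vector, rows: Iterable[Vector], k: int, batch_size: int = 2
) -> List[Tuple[float, int]]:
    """Return the k rows nearest to `query` as (Euclidean distance, row index), nearest first.

    Hypothetical helper for illustration; `rows` may be a generator that
    streams records from disk, so the full dataset is never materialized.
    """
    # Max-heap emulated with negated distances: the root is always the
    # worst of the k best candidates seen so far.
    heap: List[Tuple[float, int]] = []
    index = 0
    for batch in batches(rows, batch_size):
        for row in batch:
            dist = math.dist(query, row)
            if len(heap) < k:
                heapq.heappush(heap, (-dist, index))
            elif dist < -heap[0][0]:
                heapq.heapreplace(heap, (-dist, index))
            index += 1
    return sorted((-neg_dist, i) for neg_dist, i in heap)


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (0.0, 2.0), (10.0, 10.0)]
    # Passing an iterator mimics streaming input from disk.
    print(batched_knn((0.0, 0.0), iter(data), k=2, batch_size=2))
```

Memory use here is bounded by `batch_size` plus `k`, independent of the dataset size. A GPU-enabled package would presumably compute each batch's distances as one tensor operation rather than row by row, but the merge of per-batch candidates into a running top-k follows the same pattern.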
