Search | VHL Regional Portal (Portal Regional de la BVS)

A conditional multi-label model to improve prediction of a rare outcome: An illustration predicting autism diagnosis.

Huang, Wei A; Engelhard, Matthew; Coffman, Marika; Hill, Elliot D; Weng, Qin; Scheer, Abby; Maslow, Gary; Henao, Ricardo; Dawson, Geraldine; Goldstein, Benjamin A.

J Biomed Inform; 157: 104711, 2024 Sep.

Article in English | MEDLINE | ID: mdl-39182632

ABSTRACT

OBJECTIVE: This study aimed to develop a novel approach that uses routinely collected electronic health record (EHR) data to improve the prediction of a rare event. We illustrate the approach by improving early prediction of an autism diagnosis, a low-prevalence outcome, by leveraging correlations between autism and other neurodevelopmental conditions (NDCs).

METHODS: We introduced a conditional multi-label model that merges conditional learning and multi-label methodologies. The conditional learning approach breaks a hard task into more manageable pieces at each stage, and the multi-label approach uses information from related neurodevelopmental conditions to learn predictive latent features. The study forecast autism diagnosis by age 5.5 years using data from the first 18 months of life, and analyzed feature importance correlations to explore the alignment within the feature space across different conditions.

RESULTS: Analyzing health records from 18,156 children, we generated a model that predicts a future autism diagnosis with moderate performance (AUROC = 0.76). The proposed conditional multi-label method significantly improves predictive performance, with an AUROC of 0.80 (p < 0.001). Further examination shows that the conditional and multi-label approaches each provided marginal lift to model performance on their own, compared to a one-stage one-label approach. We also demonstrated the generalizability and applicability of this method using simulated data with high correlation between feature vectors for different labels.

CONCLUSION: Our findings underscore the effectiveness of the developed conditional multi-label model for early prediction of an autism diagnosis. The study introduces a versatile strategy applicable to prediction tasks that involve limited target populations but share underlying features or etiology among related groups.
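The AUROC values reported in this abstract can be read as the probability that a randomly chosen child who later receives the diagnosis is scored higher than a randomly chosen child who does not. The sketch below computes that quantity by direct pairwise comparison. The function name `auroc` and the toy labels and scores are illustrative assumptions; they are not the study's code or data.

```python
def auroc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: iterable of 0/1 outcomes; scores: predicted risk for each case.
    Ties between a positive and a negative case count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): a rare outcome with 2 positives among 8 cases.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 0]
toy_scores = [0.10, 0.30, 0.20, 0.60, 0.70, 0.05, 0.40, 0.15]
print(auroc(toy_labels, toy_scores))
```

The pairwise form is quadratic in the number of cases, so it suits small illustrations; a rank-based computation would be preferable for a cohort of the size described above.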

Subject(s)

Autistic Disorder; Electronic Health Records; Humans; Autistic Disorder/diagnosis; Child, Preschool; Infant; Male; Female; Child; Algorithms

ct2vl: A Robust Public Resource for Converting SARS-CoV-2 Ct Values to Viral Loads.

Hill, Elliot D; Yilmaz, Fazilet; Callahan, Cody; Morgan, Alex; Cheng, Annie; Braun, Jasper; Arnaout, Ramy.

Viruses; 16(7), 2024 Jun 30.

Article in English | MEDLINE | ID: mdl-39066220

ABSTRACT

The amount of SARS-CoV-2 in a sample is often measured using Ct values. However, the same Ct value may correspond to different viral loads on different platforms and assays, making them difficult to compare from study to study. To address this problem, we developed ct2vl, a Python package that converts Ct values to viral loads for any RT-qPCR assay/platform. The method is novel in that it is based on determining the maximum PCR replication efficiency, as opposed to fitting a sigmoid (S-shaped) curve relating signal to cycle number. We calibrated ct2vl on two FDA-approved platforms and validated its performance using reference-standard material, including sensitivity analysis. We found that ct2vl-predicted viral loads were highly accurate across five orders of magnitude, with 1.6-fold median error (for comparison, viral loads in clinical samples vary over 10 orders of magnitude). The package has 100% test coverage. We describe installation and usage both from the Unix command-line and from interactive Python environments. ct2vl is freely available via the Python Package Index (PyPI). It facilitates conversion of Ct values to viral loads for clinical investigators, basic researchers, and test developers for any RT-qPCR platform. It thus facilitates comparison among the many quantitative studies of SARS-CoV-2 by helping render observations in a natural, universal unit of measure.
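To make the link between Ct value and viral load concrete, here is a deliberately simplified sketch. It assumes a constant per-cycle replication efficiency, so that a sample crossing the threshold one cycle earlier holds (1 + efficiency) times more template. This is not the ct2vl API and not its calibrated model; the function `ct_to_viral_load`, its parameters, and the calibration numbers are hypothetical.

```python
def ct_to_viral_load(ct, ref_ct, ref_viral_load, efficiency=1.0):
    """Convert a Ct value to a viral load under a constant-efficiency model.

    ref_ct / ref_viral_load: one calibration point (assumed known).
    efficiency: fraction of templates copied per cycle, 0 < efficiency <= 1;
                1.0 means perfect doubling each cycle.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    # Each cycle of difference in Ct scales the starting template
    # by a factor of (1 + efficiency).
    return ref_viral_load * (1.0 + efficiency) ** (ref_ct - ct)


# Hypothetical calibration: Ct 30 corresponds to 1,000 copies/mL.
for ct in (20, 25, 30, 35):
    print(ct, ct_to_viral_load(ct, ref_ct=30, ref_viral_load=1000.0))
```

Under perfect doubling, a difference of 10 cycles corresponds to a 1,024-fold difference in viral load, which is why small platform-specific shifts in Ct translate into large differences in the inferred load.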

Subject(s)

COVID-19; SARS-CoV-2; Viral Load; Humans; SARS-CoV-2/genetics; COVID-19/virology; Real-Time Polymerase Chain Reaction/methods; Software; COVID-19 Nucleic Acid Testing/methods; Sensitivity and Specificity

greylock: A Python Package for Measuring The Composition of Complex Datasets.

Nguyen, Phuc; Arora, Rohit; Hill, Elliot D; Braun, Jasper; Morgan, Alexandra; Quintana, Liza M; Mazzoni, Gabrielle; Lee, Ghee Rye; Arnaout, Rima; Arnaout, Ramy.

ArXiv; 2023 Dec 29.

Article in English | MEDLINE | ID: mdl-39070042

ABSTRACT

Machine-learning datasets are typically characterized by measuring their size and class balance. However, there exists a richer and potentially more useful set of measures, termed diversity measures, that incorporate elements' frequencies and between-element similarities. Although these have been available in the R and Julia programming languages for other applications, they have not been as readily available in Python, which is widely used for machine learning, and are not easily applied to machine-learning-sized datasets without special coding considerations. To address these issues, we developed greylock, a Python package that calculates diversity measures and is tailored to large datasets. greylock can calculate any of the frequency-sensitive measures of Hill's D-number framework, and going beyond Hill, their similarity-sensitive counterparts (Greylock is a mountain). greylock also outputs measures that compare datasets (beta diversities). We first briefly review the D-number framework, illustrating how it incorporates elements' frequencies and between-element similarities. We then describe greylock's key features and usage. We end with several examples - immunomics, metagenomics, computational pathology, and medical imaging - illustrating greylock's applicability across a range of dataset types and fields.
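As a rough illustration of the frequency-sensitive measures mentioned in this abstract, the sketch below computes a Hill number of order q from class counts using the standard formula D_q = (Σ p_i^q)^(1/(1−q)), with the q = 1 case taken as the exponential of Shannon entropy. It does not use greylock or reproduce its API, and it omits the similarity-sensitive counterparts; the function name `hill_number` and the example counts are assumptions.

```python
import math


def hill_number(counts, q):
    """Frequency-sensitive Hill number (effective number of classes) of order q."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    # Relative frequencies of the classes that are actually present.
    p = [c / total for c in counts if c > 0]
    if q == 1:
        # Limit as q -> 1: exponential of Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    if q == math.inf:
        # Limit as q -> infinity: reciprocal of the largest frequency.
        return 1.0 / max(p)
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


balanced = [25, 25, 25, 25]   # hypothetical class counts
imbalanced = [97, 1, 1, 1]
for q in (0, 1, 2, math.inf):
    print(q, hill_number(balanced, q), hill_number(imbalanced, q))
```

For a perfectly balanced dataset every order gives the number of classes, while for an imbalanced one the value falls toward 1 as q grows, because higher orders weight common classes more heavily.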
