Generalizable Long COVID Subtypes: Findings from the NIH N3C and RECOVER Programs

Justin Reese; Hannah Blau; Timothy Bergquist; Johanna J. Loomba; Tiffany Callahan; Bryan Laraway; Corneliu Antonescu; Elena Casiraghi; Ben Coleman; Michael Gargano; Kenneth Wilkins; Luca Cappelletti; Tommaso Fontana; Nariman Ammar; Blessy Antony; T. M. Murali; Guy Karlebach; Julie A. McMurry; Andrew Williams; Richard Moffitt; Jineta Banerjee; Anthony E. Solomonides; Hannah Davis; Kristin Kostka; Giorgio Valentini; David Sahner; Christopher G. Chute; Charisse Madlock-Brown; Melissa A. Haendel; Peter N. Robinson

This article is a preprint

Preprints are preliminary research reports that have not been certified by peer review. They should not be relied on to guide clinical practice or health-related behavior and should not be reported in the media as established information.

Preprints posted online allow authors to receive feedback quickly, and the entire scientific community can independently evaluate the work and respond appropriately. These comments are posted alongside the preprints for anyone to read and serve as a post-publication review.

Affiliations

Justin Reese; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Hannah Blau; The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA
Timothy Bergquist; Sage Bionetworks, Seattle, WA, USA
Johanna J. Loomba; The Integrated Translational Health Research Institute of Virginia (iTHRIV), University of Virginia, Charlottesville, Virginia, USA
Tiffany Callahan; Department of Biomedical Informatics, Columbia University, New York, NY, USA
Bryan Laraway; University of Colorado Anschutz Medical Campus, Aurora, CO, USA
Corneliu Antonescu; University of Arizona - Banner Health, Phoenix, AZ, USA
Elena Casiraghi; AnacletoLab, Dipartimento di Informatica, Universita degli Studi di Milano, Italy
Ben Coleman; The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA
Michael Gargano; The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA
Kenneth Wilkins; Biostatistics Program, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, USA
Luca Cappelletti; AnacletoLab, Dipartimento di Informatica, Universita degli Studi di Milano, Italy
Tommaso Fontana; AnacletoLab, Dipartimento di Informatica, Universita degli Studi di Milano, Italy
Nariman Ammar; University of Tennessee Health Science Center, Memphis, TN, USA
Blessy Antony; Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
T. M. Murali; Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
Guy Karlebach; The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA
Julie A. McMurry; University of Colorado Anschutz Medical Campus, Aurora, CO, USA
Andrew Williams; Tufts Medical Center Clinical and Translational Science Institute, Tufts Medical Center, Boston, MA, USA
Richard Moffitt; Stony Brook University Department of Biomedical Informatics and Stony Brook Cancer Center, Stony Brook, NY, USA
Jineta Banerjee; Sage Bionetworks, Seattle, WA, USA
Anthony E. Solomonides; NorthShore University HealthSystem Research Institute, Evanston, IL, USA
Hannah Davis; Patient-Led Research Collaborative, NY, USA
Kristin Kostka; Northeastern University, OHDSI Center at the Roux Institute, Boston, MA, USA
Giorgio Valentini; AnacletoLab, Dipartimento di Informatica, Universita degli Studi di Milano, Italy
David Sahner; Axle Informatics, Rockville, MD, USA
Christopher G. Chute; Schools of Medicine, Public Health, and Nursing, Johns Hopkins University, Baltimore, MD, USA
Charisse Madlock-Brown; University of Tennessee Health Science Center, Memphis, TN, USA
Melissa A. Haendel; University of Colorado Anschutz Medical Campus, Aurora, CO, USA
Peter N. Robinson; The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA

Preprint in English | PREPRINT-MEDRXIV | ID: ppmedrxiv-22275398

ABSTRACT

Accurate stratification of patients with post-acute sequelae of SARS-CoV-2 infection (PASC, or long COVID) would allow precision clinical management strategies. However, the natural history of long COVID is incompletely understood and characterized by an extremely wide range of manifestations that are difficult to analyze computationally. In addition, the generalizability of machine learning classification of COVID-19 clinical outcomes has rarely been tested. We present a method for computationally modeling PASC phenotype data based on electronic healthcare records (EHRs) and for assessing pairwise phenotypic similarity between patients using semantic similarity. Our approach defines a nonlinear similarity function that maps from a feature space of phenotypic abnormalities to a matrix of pairwise patient similarity that can be clustered using unsupervised machine learning procedures. Using k-means clustering of this similarity matrix, we found six distinct clusters of PASC patients, each with distinct profiles of phenotypic abnormalities. There was a significant association of cluster membership with a range of pre-existing conditions and with measures of severity during acute COVID-19. Two of the clusters were associated with severe manifestations and displayed increased mortality. We assigned new patients from other healthcare centers to one of the six clusters on the basis of maximum semantic similarity to the original patients. We show that the identified clusters were generalizable across different hospital systems and that the increased mortality rate was consistently observed in two of the clusters. Semantic phenotypic clustering can provide a foundation for assigning patients to stratified subgroups for natural history or therapy studies on PASC.
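The pipeline the abstract describes (pairwise patient similarity, k-means on the similarity matrix, then assignment of new patients by similarity to the original ones) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the Jaccard overlap is a simple stand-in for the semantic similarity function used in the study, the phenotype terms and patients are invented toy data, k is set to 2 rather than the six clusters reported, and the assignment rule (highest mean similarity to a cluster's members) is one possible reading of "maximum semantic similarity".

```python
# Toy sketch: cluster patients via a pairwise similarity matrix, then
# assign new patients to an existing cluster. Not the authors' code.

def jaccard(a, b):
    """Stand-in similarity between two sets of phenotype terms (assumption)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_matrix(patients):
    """Matrix of pairwise patient similarity; row i describes patient i."""
    return [[jaccard(p, q) for q in patients] for p in patients]

def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def kmeans(rows, k, n_iter=100):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(r, centroids[c]))
                      for r in rows]
        if new_labels == labels:  # converged: assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def assign_new(new_patient, patients, labels):
    """Assign a new patient to the cluster with the highest mean similarity
    to its original members (hypothetical assignment rule)."""
    def mean_sim(c):
        sims = [jaccard(new_patient, p) for p, lab in zip(patients, labels) if lab == c]
        return sum(sims) / len(sims)
    return max(sorted(set(labels)), key=mean_sim)

# Invented toy cohort: three respiratory-type and three neuro-type patients.
patients = [
    {"fatigue", "dyspnea", "cough"},
    {"fatigue", "dyspnea"},
    {"fatigue", "dyspnea", "cough", "chest pain"},
    {"anxiety", "insomnia", "headache"},
    {"anxiety", "insomnia"},
    {"anxiety", "insomnia", "headache", "brain fog"},
]

labels = kmeans(similarity_matrix(patients), k=2)
print("cluster labels:", labels)
print("new patient ->", assign_new({"fatigue", "cough"}, patients, labels))
```

Clustering the rows of the similarity matrix, rather than the raw phenotype features, means each patient is represented by how similar they are to every other patient, which is what lets a nonlinear similarity function feed a standard algorithm such as k-means. In a real analysis the Jaccard stand-in would be replaced by an ontology-based semantic similarity measure.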

License

CC BY-NC

Full text: 1 | Collection: 09-preprints | Database: PREPRINT-MEDRXIV | Study type: Prognostic studies | Language: English | Year: 2022 | Document type: Preprint
