An efficient, not-only-linear correlation coefficient based on clustering.
Pividori, Milton; Ritchie, Marylyn D; Milone, Diego H; Greene, Casey S.
Affiliation
  • Pividori M; Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO 80045, USA; Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA. Electronic address: milton.pividori@cuanschutz.edu.
  • Ritchie MD; Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
  • Milone DH; Research Institute for Signals, Systems and Computational Intelligence (sinc(i)), Universidad Nacional del Litoral, Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Santa Fe CP3000, Argentina.
  • Greene CS; Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO 80045, USA; Center for Health AI, University of Colorado School of Medicine, Aurora, CO 80045, USA. Electronic address: casey.s.greene@cuanschutz.edu.
Cell Syst; 15(9): 854-868.e3, 2024 Sep 18.
Article in English | MEDLINE | ID: mdl-39243756
ABSTRACT
Identifying meaningful patterns in data is crucial for understanding complex biological processes, particularly in transcriptomics, where genes with correlated expression often share functions or contribute to disease mechanisms. Traditional correlation coefficients, which primarily capture linear relationships, may overlook important nonlinear patterns. We introduce the clustermatch correlation coefficient (CCC), a not-only-linear coefficient that utilizes clustering to efficiently detect both linear and nonlinear associations. CCC outperforms standard methods by revealing biologically meaningful patterns that linear-only coefficients miss and is faster than state-of-the-art coefficients such as the maximal information coefficient. When applied to human gene expression data from genotype-tissue expression (GTEx), CCC identified robust linear relationships and nonlinear patterns, such as sex-specific differences, that are undetectable by standard methods. Highly ranked gene pairs were enriched for interactions in integrated networks built from protein-protein interactions, transcription factor regulation, and chemical and genetic perturbations, suggesting that CCC can detect functional relationships missed by linear-only approaches. CCC is a highly efficient, next-generation, not-only-linear correlation coefficient for genome-scale data. A record of this paper's transparent peer review process is included in the supplemental information.
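The abstract does not spell out how clustering is turned into a coefficient, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each variable is partitioned into rank-based quantile bins for several bin counts, that partitions are compared with the adjusted Rand index, and that the maximum agreement is reported. The function names (`quantile_labels`, `adjusted_rand_index`, `clustering_coefficient`) and the `max_k` parameter are hypothetical.

```python
from collections import Counter
from math import comb


def quantile_labels(values, k):
    """Assign each value to one of k roughly equal-sized rank-based bins.

    Simplification (assumption): tied values are split by input order.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, idx in enumerate(order):
        labels[idx] = rank * k // len(values)
    return labels


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(a)
    sum_pairs = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 0.0
    return (sum_pairs - expected) / (max_index - expected)


def clustering_coefficient(x, y, max_k=10):
    """Largest ARI over all pairs of quantile partitions of x and y.

    Hypothetical sketch: negative agreement is clipped to 0, so the
    result lies in [0, 1].
    """
    ks = range(2, min(max_k, len(x) - 1) + 1)
    parts_x = [quantile_labels(x, k) for k in ks]
    parts_y = [quantile_labels(y, k) for k in ks]
    best = max(adjusted_rand_index(px, py) for px in parts_x for py in parts_y)
    return max(best, 0.0)


x = list(range(-50, 51))
linear = clustering_coefficient(x, [2 * v + 1 for v in x])
quadratic = clustering_coefficient(x, [v * v for v in x])
print(linear, quadratic)
```

On this symmetric input the quadratic relationship has zero Pearson correlation, yet the partitions of `x` and `x**2` still agree well above chance. That is the kind of not-only-linear signal the abstract describes.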

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Gene Expression Profiling Limits: Humans Language: En Journal: Cell Syst Year: 2024 Document type: Article Country of publication: United States
