Unravelling the hidden heterogeneities of diffuse large B-cell lymphoma based on coupled two-way clustering.

Zhang, Wei; Li, Li; Li, Xia; Jiang, Wei; Huo, Jianmin; Wang, Yadong; Lin, Meihua; Rao, Shaoqi

Affiliation

Zhang W; The First Clinical College, Department of Bioinformatics, and the Bio-pharmaceutical Key Laboratory of Heilongjiang Province and State, Harbin Medical University, Harbin 150086, China. weipoza@163.com

BMC Genomics ; 8: 332, 2007 Sep 22.

Article in English | MEDLINE | ID: mdl-17888167

ABSTRACT

BACKGROUND: It is becoming increasingly clear that our current taxonomy of clinical phenotypes is confounded by molecular heterogeneity. Defining the hidden, molecularly distinct diseases with modern large-scale genomic approaches is vital for refining clinical practice and improving intervention strategies. Microarray omics technology has provided a powerful way to dissect the hidden genetic heterogeneity of complex diseases. The aim of this study was thus to develop a bioinformatics approach to identify the transcriptional features that underlie hidden subtypes of a complex clinical phenotype. The basic strategy of the proposed method was to iteratively partition both the sample space and the feature space with the super-paramagnetic clustering technique, and to seek hard and robust gene clusters that lead to a natural partition of disease samples and that have the highest functional conceptual consensus as evaluated with Gene Ontology.

RESULTS: We applied the proposed method to two publicly available microarray datasets of diffuse large B-cell lymphoma (DLBCL), a notoriously heterogeneous phenotype. A feature subset of 30 genes (38 probes), derived from analysis of the first dataset consisting of 4026 genes and 42 DLBCL samples, identified three categories of patients with very different five-year overall survival rates (70.59%, 44.44% and 14.29%, respectively; p = 0.0017). Analysis of the second dataset, consisting of 7129 genes and 58 DLBCL samples, revealed a feature subset of 13 genes (16 probes) that not only replicated the findings of the important DLBCL genes (e.g. JAW1 and BCL7A), but also identified three subtypes (with five-year overall survival rates of 63.13%, 34.92% and 15.38%, respectively; p = 0.0009) clinically similar to those identified in the first dataset. Finally, we built a multivariate Cox proportional-hazards prediction model for each feature subset and identified JAW1 as one of the most significant predictors (p = 0.005 and 0.014; hazard ratios = 0.02 and 0.03 for the two datasets, respectively) for both DLBCL cohorts under study.

CONCLUSION: Our results showed that the proposed algorithm is a promising computational strategy for peeling off the hidden genetic heterogeneity based on transcriptional profiling of disease samples, which may lead to improved diagnosis and treatment of cancers.
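
The iterative two-way partitioning described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the super-paramagnetic clustering step is replaced here by a much simpler stand-in (connected components under a root-mean-square distance threshold), the stability and Gene Ontology criteria are omitted, and every name (rms, cluster, coupled_two_way, max_rms, min_size, the toy matrix) is hypothetical.

```python
from itertools import product
from math import sqrt


def rms(x, y):
    """Root-mean-square difference between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def cluster(vectors, max_rms=1.5, min_size=2):
    """Stand-in clusterer (NOT super-paramagnetic clustering).

    Links two items when their RMS difference is <= max_rms and returns
    the connected components with at least min_size members.
    vectors: dict mapping item id -> list of values.
    """
    ids = list(vectors)
    parent = {i: i for i in ids}

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for pos, a in enumerate(ids):
        for b in ids[pos + 1:]:
            if rms(vectors[a], vectors[b]) <= max_rms:
                parent[find(a)] = find(b)

    groups = {}
    for i in ids:
        groups.setdefault(find(i), set()).add(i)
    return {frozenset(g) for g in groups.values() if len(g) >= min_size}


def coupled_two_way(matrix, max_rounds=5):
    """matrix[g][s] is the expression of gene g in sample s.

    Starting from the full gene and sample sets, every known
    (gene set, sample set) pair defines a submatrix; its genes and its
    samples are each re-clustered, and new clusters join the pool.
    """
    gene_sets = {frozenset(range(len(matrix)))}
    sample_sets = {frozenset(range(len(matrix[0])))}
    visited = set()
    for _ in range(max_rounds):
        new_genes, new_samples = set(), set()
        for genes, samples in product(gene_sets, sample_sets):
            if (genes, samples) in visited:
                continue
            visited.add((genes, samples))
            gs, ss = sorted(genes), sorted(samples)
            # Cluster the genes using only the chosen samples.
            new_genes |= cluster({g: [matrix[g][s] for s in ss] for g in gs})
            # Cluster the samples using only the chosen genes.
            new_samples |= cluster({s: [matrix[g][s] for g in gs] for s in ss})
        if new_genes <= gene_sets and new_samples <= sample_sets:
            break  # no new clusters were found in this round
        gene_sets |= new_genes
        sample_sets |= new_samples
    return gene_sets, sample_sets


# Toy data: genes 0 and 1 separate samples 0-2 from samples 3-5,
# while genes 2 and 3 add unrelated variation that hides the split.
toy_matrix = [
    [5, 6, 5, 1, 0, 1],
    [6, 5, 6, 0, 1, 0],
    [1, 5, 0, 6, 1, 5],
    [5, 1, 6, 0, 5, 1],
]
gene_sets, sample_sets = coupled_two_way(toy_matrix)
print(sorted(sorted(g) for g in gene_sets))
print(sorted(sorted(s) for s in sample_sets))
```

In this toy example, clustering the samples on all four genes does not recover the two groups of three, but re-clustering them on the gene cluster {0, 1} alone does, which is the kind of hidden partition the coupled procedure is meant to expose.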

Subject(s)

Genetic Heterogeneity; Lymphoma, Large B-Cell, Diffuse/genetics; Cluster Analysis; Gene Expression Profiling; Humans; Oligonucleotide Array Sequence Analysis

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Lymphoma, Large B-Cell, Diffuse / Genetic Heterogeneity | Type of study: Prognostic_studies | Limits: Humans | Language: En | Journal: BMC Genomics | Journal subject: GENETICS | Year: 2007 | Document type: Article | Affiliation country: China | Country of publication: United Kingdom