cellsig plug-in enhances CIBERSORTx signature selection for multidataset transcriptomes with sparse multilevel modelling.
Bioinformatics; 39(12), 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-37952182
MOTIVATION: The precise characterization of cell-type transcriptomes is pivotal to understanding cellular lineages, deconvolution of bulk transcriptomes, and clinical applications. Single-cell RNA sequencing resources like the Human Cell Atlas have revolutionised cell-type profiling. However, challenges persist due to data heterogeneity and discrepancies across different studies. One limitation of prevailing tools such as CIBERSORTx is that they cannot address hierarchical data structures or handle nonoverlapping gene sets across samples, and instead rely on filtering or imputation.
RESULTS: Here, we present cellsig, a Bayesian sparse multilevel model designed to improve signature estimation by adjusting data for multilevel effects and modelling gene-set sparsity. Our model is tailored to large-scale, heterogeneous pseudobulk and bulk RNA sequencing data collections with nonoverlapping gene sets. We tested the performance of cellsig on a novel curated Human Bulk Cell-type Catalogue, which harmonizes 1435 samples across 58 datasets. We show that cellsig significantly enhances cell-type marker gene ranking performance. This approach is valuable for cell-type signature selection, with implications for marker gene validation, single-cell annotation, and deconvolution benchmarks.
AVAILABILITY AND IMPLEMENTATION: Code and the interactive app are available at https://github.com/stemangiola/cellsig, and the database is available at https://doi.org/10.5281/zenodo.7582421.
Full text: 1
Collection: 01-international
Database: MEDLINE
Main subject: Gene Expression Profiling / Transcriptome
Limits: Humans
Language: En
Journal: Bioinformatics
Journal subject: MEDICAL INFORMATICS
Year: 2023
Document type: Article
Country of affiliation: Australia
Country of publication: United Kingdom