Results 1 - 5 of 5
1.
J Mol Neurosci ; 74(2): 43, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38619646

ABSTRACT

Alzheimer's disease (AD) is a progressive and irreversible neurodegenerative disorder. Its etiology may be associated with genetic, environmental, and lifestyle factors. With the advancement of technology, the integration of genomics, transcriptomics, and imaging data related to AD allows simultaneous exploration of molecular information at different levels and their interaction within the organism. This paper proposes a hypergraph-regularized joint deep semi-non-negative matrix factorization (HR-JDSNMF) algorithm to integrate positron emission tomography (PET), single-nucleotide polymorphism (SNP), and gene expression data for AD. The method employs matrix factorization techniques to nonlinearly decompose the original data at multiple layers, extracting deep features from different omics data, and utilizes hypergraph mining to uncover high-order correlations among the three types of data. Experimental results demonstrate that this approach outperforms several matrix factorization-based algorithms and effectively identifies multi-omics biomarkers for AD. Additionally, single-cell RNA sequencing (scRNA-seq) data for AD were collected, and genes within significant modules were used to categorize different types of cell clusters into high- and low-risk cell groups. Finally, the study extensively explores the differences in differentiation and communication between these two cell groups. The multi-omics biomarkers unearthed in this study can serve as valuable references for clinical diagnosis and drug target discovery for AD. The code implementing the algorithm is available at https://github.com/ShubingKong/HR-JDSNMF.


Subject(s)
Alzheimer Disease; Humans; Alzheimer Disease/genetics; Multiomics; Algorithms; Biomarkers; Cell Differentiation
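
As background for the factorization named in the abstract above, the following is a minimal sketch of plain single-layer semi-non-negative matrix factorization (X ≈ F Gᵀ with G ≥ 0 and F unconstrained), using the standard alternating least-squares and multiplicative updates. It is not the HR-JDSNMF algorithm: the hypergraph regularization, the joint treatment of three data types, and the multi-layer decomposition are all omitted. The function name `semi_nmf` and its parameters are illustrative assumptions, not taken from the paper or its repository.

```python
import numpy as np


def semi_nmf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Factor X (features x samples) as F @ G.T with G >= 0.

    Hypothetical helper for illustration only; F may take any sign.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[1]
    # Start from a strictly positive G so the multiplicative update is defined.
    G = rng.random((n_samples, k)) + 0.1

    def pos(A):
        # Element-wise positive part of A.
        return (np.abs(A) + A) / 2

    def neg(A):
        # Element-wise negative part of A (returned as non-negative values).
        return (np.abs(A) - A) / 2

    for _ in range(n_iter):
        # F update: unconstrained least-squares solution for the current G.
        F = X @ G @ np.linalg.pinv(G.T @ G)
        XtF = X.T @ F
        FtF = F.T @ F
        # G update: multiplicative rule that keeps every entry non-negative.
        num = pos(XtF) + G @ neg(FtF)
        den = neg(XtF) + G @ pos(FtF)
        G = G * np.sqrt((num + eps) / (den + eps))

    # Final least-squares refit of F for the returned G.
    F = X @ G @ np.linalg.pinv(G.T @ G)
    return F, G


# Toy data with mixed signs (made up for the example).
X_demo = np.random.default_rng(1).standard_normal((6, 20))
F_demo, G_demo = semi_nmf(X_demo, k=3)
rel_err = np.linalg.norm(X_demo - F_demo @ G_demo.T) / np.linalg.norm(X_demo)
print(f"relative reconstruction error: {rel_err:.3f}")
```

Because only G is constrained, this form can be applied to data with negative entries, which is one common reason to prefer semi-NMF over standard NMF for normalized expression matrices.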
2.
BMC Bioinformatics ; 24(1): 275, 2023 Jul 04.
Article in English | MEDLINE | ID: mdl-37403016

ABSTRACT

BACKGROUND: P4 medicine (predict, prevent, personalize, and participate) is a new approach to diagnosing and predicting diseases on a patient-by-patient basis. For the prevention and treatment of diseases, prediction plays a fundamental role. One of the intelligent strategies is the design of deep learning models that can predict the state of the disease using gene expression data. RESULTS: We create an autoencoder deep learning model called DeeP4med, comprising a Classifier and a Transferor, that predicts a cancer's gene expression (mRNA) matrix from its matched normal sample and vice versa. Depending on tissue type, the F1 score of the model ranges from 0.935 to 0.999 in the Classifier and from 0.944 to 0.999 in the Transferor. The accuracy of DeeP4med for tissue and disease classification was 0.986 and 0.992, respectively, outperforming seven classic machine learning models (Support Vector Classifier, Logistic Regression, Linear Discriminant Analysis, Naive Bayes, Decision Tree, Random Forest, K Nearest Neighbors). CONCLUSIONS: Based on the idea of DeeP4med, by having the gene expression matrix of a normal tissue, we can predict its tumor gene expression matrix and, in this way, identify genes that are effective in transforming a normal tissue into a tumor tissue. Results of Differentially Expressed Genes (DEGs) and enrichment analysis on the predicted matrices for 13 types of cancer showed a good correlation with the literature and biological databases. This suggests that, when trained on gene expression matrices capturing each person's features in the normal and cancer states, the model could predict diagnosis based on gene expression data from healthy tissue and be used to identify possible therapeutic interventions for those patients.


Subject(s)
Deep Learning; Neoplasms; Humans; Transcriptome; Bayes Theorem; Neoplasms/genetics; Machine Learning
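
The abstract above reports performance as F1 scores. As a reminder of what that metric measures, here is a minimal sketch that computes F1 (the harmonic mean of precision and recall) for one class. The labels are made up for the example and are unrelated to the DeeP4med data; the function name `f1_for_class` is an illustrative assumption.

```python
def f1_for_class(y_true, y_pred, positive):
    """Return the F1 score of `positive`, treating all other labels as negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for illustration only.
y_true = ["tumor", "tumor", "normal", "normal", "tumor"]
y_pred = ["tumor", "normal", "normal", "tumor", "tumor"]
print(f1_for_class(y_true, y_pred, positive="tumor"))
```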
3.
BMC Bioinformatics ; 23(1): 156, 2022 May 02.
Article in English | MEDLINE | ID: mdl-35501696

ABSTRACT

BACKGROUND: Quantification of gene expression from RNA-seq data is a prerequisite for transcriptome analysis such as differential gene expression analysis and gene co-expression network construction. Individual RNA-seq experiments are growing larger, and combining multiple experiments from sequence repositories can result in datasets with thousands of samples. Processing hundreds to thousands of RNA-seq samples can result in challenges related to data management, access to sufficient computational resources, navigation of high-performance computing (HPC) systems, installation of required software dependencies, and reproducibility. Processing of larger and deeper RNA-seq experiments will become more common as sequencing technology matures. RESULTS: GEMmaker is an nf-core compliant Nextflow workflow that quantifies gene expression from small to massive RNA-seq datasets. GEMmaker ensures results are highly reproducible through the use of versioned containerized software that can be executed on a single workstation, institutional compute cluster, Kubernetes platform, or the cloud. GEMmaker supports popular alignment and quantification tools, providing results in raw and normalized formats. GEMmaker is unique in that it can scale to process thousands of locally or remotely stored samples without exceeding available data storage. CONCLUSIONS: Workflows that quantify gene expression are not new, and many already address issues of portability, reusability, and scale in terms of access to CPUs. GEMmaker provides these benefits and adds the ability to scale despite limited data storage infrastructure. This allows users to process hundreds to thousands of RNA-seq samples even when data storage resources are limited. GEMmaker is freely available and fully documented with step-by-step setup and execution instructions.


Subject(s)
High-Throughput Nucleotide Sequencing; Software; High-Throughput Nucleotide Sequencing/methods; RNA-Seq; Reproducibility of Results; Sequence Analysis, RNA/methods
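
The abstract above mentions results in "raw and normalized formats" without naming a normalization. As one common example, the sketch below converts a raw count matrix to transcripts per million (TPM). Choosing TPM here is an assumption, and the code is not part of GEMmaker or taken from it. The function name `counts_to_tpm`, the toy counts, and the gene lengths are all made up for illustration.

```python
import numpy as np


def counts_to_tpm(counts, lengths_bp):
    """Convert raw counts (genes x samples) to TPM.

    Assumes every gene length is positive and every sample has at least
    one read; neither condition is checked in this sketch.
    """
    counts = np.asarray(counts, dtype=float)
    lengths_bp = np.asarray(lengths_bp, dtype=float)
    # Reads per base: corrects for longer genes attracting more reads.
    rate = counts / lengths_bp[:, None]
    # Rescale each sample (column) so that its values sum to one million.
    return rate / rate.sum(axis=0, keepdims=True) * 1e6


# Toy data: 3 genes x 2 samples, gene lengths in base pairs.
toy_counts = np.array([[10, 0], [20, 30], [30, 30]])
toy_lengths = np.array([1000, 2000, 3000])
toy_tpm = counts_to_tpm(toy_counts, toy_lengths)
print(toy_tpm)
```

Because each column is rescaled to the same total, TPM values are comparable across samples in a way that raw counts are not.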
4.
Methods Mol Biol ; 2048: 155-205, 2019.
Article in English | MEDLINE | ID: mdl-31396939

ABSTRACT

Single-cell RNA-seq (scRNA-seq) has provided novel routes to investigate the heterogeneous populations of T cells and is rapidly becoming a common tool for molecular profiling and identification of novel subsets and functions. This chapter offers an experimental and computational workflow for scRNA-seq analysis of T cells. We focus on the analyses of scRNA-seq data derived from plate-based sorted T cells using flow cytometry and full-length transcriptome protocols such as Smart-Seq2. However, the proposed pipeline can be applied to other high-throughput approaches such as UMI-based methods. We describe a detailed bioinformatics pipeline that can be easily reproduced and discuss future directions and current limitations of these methods in the context of T cell biology.


Subject(s)
Computational Biology/methods; RNA-Seq/methods; Single-Cell Analysis/methods; T-Lymphocytes/metabolism; Animals; Cluster Analysis; Flow Cytometry/instrumentation; Flow Cytometry/methods; High-Throughput Nucleotide Sequencing/instrumentation; High-Throughput Nucleotide Sequencing/methods; Humans; Mice; RNA-Seq/instrumentation; Single-Cell Analysis/instrumentation; Software; Transcriptome; Workflow
5.
Artif Intell Med ; 70: 1-11, 2016 06.
Article in English | MEDLINE | ID: mdl-27431033

ABSTRACT

OBJECTIVE: High-throughput technologies have generated an unprecedented amount of high-dimensional gene expression data. Algorithmic approaches could be extremely useful to distill information and derive compact interpretable representations of the statistical patterns present in the data. This paper proposes a mining approach to extract an informative representation of gene expression profiles based on a generative model called the Counting Grid (CG). METHOD: Using the CG model, gene expression values are arranged on a discrete grid, learned in such a way that "similar" co-expression patterns are arranged in close proximity, thus resulting in an intuitive visualization of the dataset. Moreover, the model makes it possible to identify the genes that distinguish between classes (e.g. different types of cancer). Finally, each sample can be characterized with a discriminative signature, extracted from the model, that can be effectively employed for classification. RESULTS: A thorough evaluation on several gene expression datasets demonstrates the suitability of the proposed approach from a twofold perspective: numerically, we reached state-of-the-art classification accuracies on 5 datasets out of 7, and similar results when the approach is tested in a gene selection setting (with a stability always above 0.87); clinically, by confirming that many of the genes highlighted by the model as significant also play a key role in cancer biology. CONCLUSION: The proposed framework can be successfully exploited to meaningfully visualize the samples, detect medically relevant genes, and properly classify samples.


Subject(s)
Algorithms; Data Mining; Gene Expression Profiling; Cluster Analysis; Genes, Neoplasm; Humans; Neoplasms/genetics