Results 1 - 12 of 12
1.
Brief Bioinform ; 23(6)2022 11 19.
Article in English | MEDLINE | ID: mdl-36208175

ABSTRACT

Cell-type composition of intact bulk tissues can vary across samples. Deciphering cell-type composition and its changes during disease progression is an important step toward understanding disease pathogenesis. To infer cell-type composition, existing cell-type deconvolution methods for bulk RNA sequencing (RNA-seq) data often require matched single-cell RNA-seq (scRNA-seq) data, generated from samples with similar clinical conditions, as reference. However, due to the difficulty of obtaining scRNA-seq data in diseased samples, only limited scRNA-seq data in matched disease conditions are available. Using scRNA-seq reference to deconvolve bulk RNA-seq data from samples with different disease conditions may lead to a biased estimation of cell-type proportions. To overcome this limitation, we propose an iterative estimation procedure, MuSiC2, which is an extension of MuSiC, to perform deconvolution analysis of bulk RNA-seq data generated from samples with multiple clinical conditions where at least one condition is different from that of the scRNA-seq reference. Extensive benchmark evaluations indicated that MuSiC2 improved the accuracy of cell-type proportion estimates of bulk RNA-seq samples under different conditions as compared with the traditional MuSiC deconvolution. MuSiC2 was applied to two bulk RNA-seq datasets for deconvolution analysis, including one from human pancreatic islets and the other from human retina. We show that MuSiC2 improves current deconvolution methods and provides more accurate cell-type proportion estimates when the bulk and single-cell reference differ in clinical conditions. We believe the condition-specific cell-type composition estimates from MuSiC2 will facilitate the downstream analysis and help identify cellular targets of human diseases.


Subject(s)
RNA , Single-Cell Analysis , Humans , RNA/genetics , RNA-Seq , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Transcriptome , Sequence Analysis, RNA/methods
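
The deconvolution problem described in the abstract above can be illustrated with a minimal sketch. It assumes plain non-negative least squares rather than the weighted, iterative scheme of MuSiC or MuSiC2, and the signature matrix, the proportions, and the function name `deconvolve_nnls` are hypothetical toy values chosen for illustration, not data or code from the paper.

```python
def deconvolve_nnls(signature, bulk, n_iter=5000):
    """Estimate cell-type proportions by projected gradient descent on
    0.5 * ||bulk - signature @ p||^2 subject to p >= 0, then rescale to sum to 1."""
    n_genes = len(signature)
    n_types = len(signature[0])
    # Step size 1 / ||S||_F^2 never exceeds 1 / (largest eigenvalue of S^T S),
    # so the gradient steps cannot diverge.
    step = 1.0 / sum(v * v for row in signature for v in row)
    p = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        residual = [
            sum(signature[g][k] * p[k] for k in range(n_types)) - bulk[g]
            for g in range(n_genes)
        ]
        grad = [
            sum(signature[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        p = [max(0.0, p[k] - step * grad[k]) for k in range(n_types)]
    total = sum(p)
    if total == 0.0:
        raise ValueError("all estimated proportions are zero")
    return [v / total for v in p]


# Hypothetical toy reference: 4 genes (rows) x 3 cell types (columns).
SIGNATURE = [
    [10.0, 1.0, 0.0],
    [1.0, 8.0, 1.0],
    [0.0, 2.0, 9.0],
    [5.0, 5.0, 1.0],
]
TRUE_PROPORTIONS = [0.5, 0.3, 0.2]
# Simulated bulk sample: a noise-free mixture of the reference profiles.
BULK = [sum(s * p for s, p in zip(row, TRUE_PROPORTIONS)) for row in SIGNATURE]

estimated = deconvolve_nnls(SIGNATURE, BULK)
```

In this noise-free toy case the estimate recovers the simulated proportions. The bias the abstract describes would arise when the reference profiles in `SIGNATURE` come from a different clinical condition than the bulk sample.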
2.
Nutrients ; 14(8)2022 Apr 08.
Article in English | MEDLINE | ID: mdl-35458125

ABSTRACT

Vitamin A (VA) deficiency and diarrheal diseases are both serious public health issues worldwide. VA deficiency is associated with impaired intestinal barrier function and increased risk of mucosal infection-related mortality. The bioactive form of VA, retinoic acid, is a well-known regulator of mucosal integrity. Using Citrobacter rodentium-infected mice as a model for diarrheal diseases in humans, previous studies showed that VA-deficient (VAD) mice failed to clear C. rodentium as compared to their VA-sufficient (VAS) counterparts. However, the distinct intestinal gene responses that are dependent on the host's VA status still need to be discovered. The mRNAs extracted from the small intestine (SI) and the colon were sequenced and analyzed on three levels: differential gene expression, enrichment, and co-expression. C. rodentium infection interacted differentially with VA status to alter colon gene expression. Novel functional categories downregulated by this pathogen were identified, highlighted by genes related to the metabolism of VA, vitamin D, and ion transport, including improper upregulation of Cl- secretion and disrupted HCO3- metabolism. Our results suggest that derangement of micronutrient metabolism and ion transport, together with the compromised immune responses in VAD hosts, may be responsible for the higher mortality to C. rodentium under conditions of inadequate VA.


Subject(s)
Enterobacteriaceae Infections , Vitamin A Deficiency , Animals , Citrobacter rodentium , Colon/metabolism , Diarrhea/complications , Intestinal Mucosa/metabolism , Intestine, Small/metabolism , Mice , Mice, Inbred C57BL , Vitamin A/metabolism , Vitamin A Deficiency/complications
3.
Sci Rep ; 11(1): 15612, 2021 08 02.
Article in English | MEDLINE | ID: mdl-34341398

ABSTRACT

Age-related macular degeneration (AMD) is a blinding eye disease with no unifying theme for its etiology. We used single-cell RNA sequencing to analyze the transcriptomes of ~93,000 cells from the macula and peripheral retina from two adult human donors and bulk RNA sequencing from fifteen adult human donors with and without AMD. Analysis of our single-cell data identified 267 cell-type-specific genes. Comparison of macula and peripheral retinal regions found no cell-type differences but did identify 50 differentially expressed genes (DEGs) with about 1/3 expressed in cones. Integration of our single-cell data with bulk RNA sequencing data from normal and AMD donors showed compositional changes more pronounced in macula in rods, microglia, endothelium, Müller glia, and astrocytes in the transition from normal to advanced AMD. KEGG pathway analysis of our normal vs. advanced AMD eyes identified enrichment in complement and coagulation pathways, antigen presentation, tissue remodeling, and signaling pathways including PI3K-Akt, NOD-like, Toll-like, and Rap1. These results showcase the use of single-cell RNA sequencing to infer cell-type compositional and cell-type-specific gene expression changes in intact bulk tissue and provide a foundation for investigating molecular mechanisms of retinal disease that lead to new therapeutic targets.


Subject(s)
Macular Degeneration , Phosphatidylinositol 3-Kinases , RNA-Seq , Retina , Gene Expression Profiling , Humans , Sequence Analysis, RNA
4.
J Nutr Biochem ; 98: 108814, 2021 12.
Article in English | MEDLINE | ID: mdl-34242724

ABSTRACT

Vitamin A (VA) deficiency remains prevalent in resource-limited areas. Using Citrobacter rodentium infection in mice as a model for diarrheal diseases, previous reports showed reduced pathogen clearance and survival due to vitamin A deficient (VAD) status. To characterize the impact of preexisting VA deficiency on gene expression patterns in the intestines, and to discover novel target genes in VA-related biological pathways, VA deficiency in mice was induced by diet. Total mRNAs were extracted from small intestine (SI) and colon, and sequenced. Differentially Expressed Gene (DEG), Gene Ontology (GO) enrichment, and co-expression network analyses were performed. Comparison of the VAS and VAD groups detected 49 DEGs in SI and 94 in colon. By GO information, SI DEGs were significantly enriched in categories relevant to retinoid metabolic process, molecule binding, and immune function. Three co-expression modules showed significant correlation with VA status in SI; these modules contained four known retinoic acid targets. In addition, other SI genes of interest (e.g., Mbl2, Cxcl14, and Nr0b2) in these modules were suggested as new candidate genes regulated by VA. Furthermore, our analysis showed that markers of two cell types in SI, mast cells and Tuft cells, were significantly altered by VA status. In colon, "cell division" was the only enriched category and was negatively associated with VA. Thus, these data suggested that SI and colon have distinct networks under the regulation of dietary VA, and that preexisting VA deficiency could have a significant impact on the host response to a variety of disease conditions.


Subject(s)
Colon/metabolism , Intestine, Small/metabolism , RNA-Seq/methods , Vitamin A Deficiency/genetics , Animals , Citrobacter rodentium , Enterobacteriaceae Infections/genetics , Enterobacteriaceae Infections/microbiology , Gene Expression Profiling/methods , Gene Ontology , Mice , Mice, Inbred C57BL , RNA, Messenger/genetics , Transcriptome , Tretinoin/metabolism , Vitamin A/genetics , Vitamin A/metabolism
5.
Nat Commun ; 11(1): 2338, 2020 05 11.
Article in English | MEDLINE | ID: mdl-32393754

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) can characterize cell types and states through unsupervised clustering, but the ever increasing number of cells and batch effect impose computational challenges. We present DESC, an unsupervised deep embedding algorithm that clusters scRNA-seq data by iteratively optimizing a clustering objective function. Through iterative self-learning, DESC gradually removes batch effects, as long as technical differences across batches are smaller than true biological variations. As a soft clustering algorithm, cluster assignment probabilities from DESC are biologically interpretable and can reveal both discrete and pseudotemporal structure of cells. Comprehensive evaluations show that DESC offers a proper balance of clustering accuracy and stability, has a small footprint on memory, does not explicitly require batch information for batch effect removal, and can utilize GPU when available. As the scale of single-cell studies continues to grow, we believe DESC will offer a valuable tool for biomedical researchers to disentangle complex cellular heterogeneity.


Subject(s)
Cluster Analysis , Deep Learning , RNA-Seq , Single-Cell Analysis , Algorithms , Animals , Bone Marrow/metabolism , Gene Expression Regulation , Humans , Islets of Langerhans/metabolism , Leukocytes, Mononuclear/metabolism , Macaca , Mice , Monocytes/metabolism , Retina/metabolism
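
The soft cluster assignment probabilities mentioned in the abstract above can be sketched as follows. The abstract does not give the formula, so this example assumes the Student's t kernel commonly used in deep embedded clustering. The function name `soft_assignments`, the centroids, and the embedded point are hypothetical values for illustration only.

```python
def soft_assignments(point, centroids, alpha=1.0):
    """Return cluster membership probabilities for one embedded cell.

    Assumed kernel: q_j is proportional to
    (1 + ||z - mu_j||^2 / alpha) ** (-(alpha + 1) / 2).
    """
    weights = []
    for centroid in centroids:
        sq_dist = sum((x - y) ** 2 for x, y in zip(point, centroid))
        # Heavy-tailed kernel: distant clusters keep a small non-zero weight.
        weights.append((1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0))
    total = sum(weights)
    # Normalize so the memberships form a probability distribution.
    return [w / total for w in weights]


# Hypothetical 2-D embedding with three cluster centroids.
CENTROIDS = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
probs = soft_assignments((0.5, 0.0), CENTROIDS)
```

A cell lying between two centroids receives comparable probabilities for both, which is one way such soft assignments could expose transitional or pseudotemporal structure.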
6.
Nat Mach Intell ; 2(10): 607-618, 2020 Oct.
Article in English | MEDLINE | ID: mdl-33817554

ABSTRACT

Clustering and cell type classification are important steps in single-cell RNA-seq (scRNA-seq) analysis. As more and more scRNA-seq data become available, supervised cell type classification methods that utilize external well-annotated source data are starting to gain popularity over unsupervised clustering algorithms. However, the performance of existing supervised methods is highly dependent on source data quality, and they often have limited accuracy in classifying cell types that are missing from the source data. To overcome these limitations, we developed ItClust, a transfer learning algorithm that borrows ideas from supervised cell type classification algorithms, but also leverages information in target data to ensure sensitivity in classifying cells that are only present in the target data. Through extensive evaluations using data from different species and tissues generated with diverse scRNA-seq protocols, we show that ItClust significantly improves clustering and cell type classification accuracy over popular unsupervised clustering and supervised cell type classification algorithms.

7.
PLoS Comput Biol ; 14(9): e1006436, 2018 09.
Article in English | MEDLINE | ID: mdl-30240439

ABSTRACT

Co-expression network analysis provides useful information for studying gene regulation in biological processes. Examining condition-specific patterns of co-expression can provide insights into the underlying cellular processes activated in a particular condition. One challenge in this type of analysis is that the sample sizes in each condition are usually small, making the statistical inference of co-expression patterns highly underpowered. A joint network construction that borrows information from related structures across conditions has the potential to improve the power of the analysis. One possible approach to constructing the co-expression network is to use the Gaussian graphical model. Though several methods are available for joint estimation of multiple graphical models, they do not fully account for the heterogeneity between samples and between co-expression patterns introduced by condition specificity. Here we develop the condition-adaptive fused graphical lasso (CFGL), a data-driven approach to incorporate condition specificity in the estimation of co-expression networks. We show that this method improves the accuracy with which networks are learned. The application of this method to a rat multi-tissue dataset and The Cancer Genome Atlas (TCGA) breast cancer dataset provides interesting biological insights. In both analyses, we identify numerous modules enriched for Gene Ontology functions and observe that the modules that are upregulated in a particular condition are often involved in condition-specific activities. Interestingly, we observe that the genes strongly associated with survival time in the TCGA dataset are less likely to be network hubs, suggesting that genes associated with cancer progression are likely to govern specific functions or execute final biological functions in pathways, rather than regulating a large number of biological processes. Additionally, we observed that the tumor-specific hub genes tend to have few shared edges with normal tissue, revealing tumor-specific regulatory mechanisms.


Subject(s)
Brain/metabolism , Breast Neoplasms/metabolism , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Myocardium/metabolism , Algorithms , Animals , Area Under Curve , Breast Neoplasms/genetics , Computer Graphics , Computer Simulation , Databases, Factual , Female , Heart , Humans , Male , Neoplasms/metabolism , Normal Distribution , Rats , Software
8.
J Am Stat Assoc ; 113(523): 1028-1039, 2018.
Article in English | MEDLINE | ID: mdl-31249430

ABSTRACT

The identification of reproducible signals from the results of replicate high-throughput experiments is an important part of modern biological research. Often little is known about the dependence structure and the marginal distribution of the data, motivating the development of a nonparametric approach to assess reproducibility. The procedure, which we call the maximum rank reproducibility (MaRR) procedure, uses a maximum rank statistic to parse reproducible signals from noise without making assumptions about the distribution of reproducible signals. Because it uses the rank scale this procedure can be easily applied to a variety of data types. One application is to assess the reproducibility of RNA-seq technology using data produced by the sequencing quality control (SEQC) consortium, which coordinated a multi-laboratory effort to assess reproducibility across three RNA-seq platforms. Our results on simulations and SEQC data show that the MaRR procedure effectively controls false discovery rates, has desirable power properties, and compares well to existing methods. Supplementary materials for this article are available online.
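
The maximum rank statistic named in the abstract above can be illustrated with a short sketch. It assumes two replicates and computes, for each item, the larger of its two ranks. It covers only the statistic itself, not the false discovery rate control of the full MaRR procedure, and the function names and replicate values are hypothetical.

```python
def ranks_descending(values):
    """Rank 1 = largest value; ties are broken by original order."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def max_rank_statistic(rep1, rep2):
    """Per-item maximum of the ranks observed in two replicates."""
    r1 = ranks_descending(rep1)
    r2 = ranks_descending(rep2)
    return [max(a, b) for a, b in zip(r1, r2)]


# Hypothetical signal strengths for five genes in two replicate experiments.
REP1 = [9.1, 7.4, 5.0, 1.2, 3.3]
REP2 = [8.7, 7.9, 1.0, 4.8, 2.5]
max_ranks = max_rank_statistic(REP1, REP2)
```

Items ranked consistently high in both replicates get a small maximum rank, while an item that ranks high in only one replicate is pushed toward the bottom. Because only ranks are used, no distributional assumption about the raw measurements is needed.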

9.
BMC Bioinformatics ; 17 Suppl 1: 5, 2016 Jan 11.
Article in English | MEDLINE | ID: mdl-26818110

ABSTRACT

BACKGROUND: Determining differentially expressed genes (DEGs) between biological samples is key to understanding how genotype gives rise to phenotype. RNA-seq and microarray are the two main technologies for profiling gene expression levels. However, considerable discrepancy has been found between DEGs detected using the two technologies. Integrating data across these two platforms has the potential to improve the power and reliability of DEG detection. METHODS: We propose a rank-based semi-parametric model to determine DEGs using information across different sources and apply it to the integration of RNA-seq and microarray data. By incorporating both the significance of differential expression and the consistency across platforms, our method effectively detects DEGs with moderate but consistent signals. We demonstrate the effectiveness of our method using simulation studies, MAQC/SEQC data and a synthetic microRNA dataset. CONCLUSIONS: Our integration method is not only robust to noise and heterogeneity in the data, but also adaptive to the structure of data. In our simulations and real data studies, our approach shows higher discriminative power and identifies more biologically relevant DEGs than eBayes, DEseq and some commonly used meta-analysis methods.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Models, Statistical , Oligonucleotide Array Sequence Analysis/methods , RNA/genetics , Sequence Analysis, RNA/methods , Transcriptome , Gene Expression Profiling/methods , Humans , Reproducibility of Results
10.
Brief Bioinform ; 16(1): 32-8, 2015 Jan.
Article in English | MEDLINE | ID: mdl-24177380

ABSTRACT

As a group of important plant species in agriculture and biology, polyploids have been increasingly studied in terms of their genome structure and organization. There are two types of polyploids, allopolyploids and autopolyploids, each resulting from a different genetic origin and each undergoing meiotic divisions of a distinct complexity. A set of statistical models has been developed for linkage analysis of each type, taking into account their unique meiotic behavior, i.e. preferential pairing for allopolyploids and double reduction for autopolyploids. We synthesized these models and modified them to accommodate the linkage analysis of less informative dominant markers. By reanalysing a published data set of varying ploidy in Arabidopsis, we corrected the estimates of the meiotic recombination frequency that were intended to assess the significance of polyploidization.


Subject(s)
Arabidopsis/genetics , Genetic Linkage , Models, Genetic , Tetraploidy , Chromosome Mapping , Genes, Plant , Recombination, Genetic
11.
Brief Bioinform ; 16(1): 24-31, 2015 Jan.
Article in English | MEDLINE | ID: mdl-24335788

ABSTRACT

As an important mechanism for adaptation to heterogeneous environments, plastic responses of correlated traits to environmental alteration may also be genetically correlated, but less is known about the underlying genetic basis. We describe a statistical model for mapping specific quantitative trait loci (QTLs) that control the interrelationship of phenotypic plasticity between different traits. The model is constructed in a bivariate mixture setting and implemented with the EM algorithm to estimate the genetic effects of QTLs on correlative plastic response. We provide a series of procedures that test (1) how a QTL controls the phenotypic plasticity of a single trait; and (2) how the QTL determines the correlation of environment-induced changes of different traits. The model is readily extended to test how epistatic interactions among QTLs play a part in the correlations of different plastic traits. The model was validated through computer simulation and used to analyse multi-environment data of genetic mapping in winter wheat, showing its utility in practice.


Subject(s)
Models, Statistical , Quantitative Trait Loci/genetics , Chromosome Mapping , Gene-Environment Interaction , Genes, Plant , Phenotype , Triticum/genetics
12.
Brief Bioinform ; 15(6): 1044-56, 2014 Nov.
Article in English | MEDLINE | ID: mdl-24177379

ABSTRACT

Linkage mapping of polysomic autotetraploids, a group of economically important species that includes potato, sugarcane and rose, is difficult to conduct due to their unique meiotic property of double reduction, which allows sister chromatids to enter into the same gamete. We describe and assess a statistical model for mapping quantitative trait loci (QTLs) in polysomic autotetraploids. The model incorporates double reduction, is built in a mixture model-based framework, and is implemented with the expectation-maximization algorithm. It allows the simultaneous estimation of QTL positions, QTL effects and the degree of double reduction as well as the assessment of the estimation precision of these parameters. We performed computer simulation to examine the statistical properties of the method and validated its use by analyzing real data in tetraploid switchgrass.


Subject(s)
Chromosome Mapping/statistics & numerical data , Models, Genetic , Quantitative Trait Loci , Tetraploidy , Algorithms , Computational Biology , Computer Simulation , Likelihood Functions , Models, Statistical , Monte Carlo Method , Panicum/genetics , Plants/genetics , Polyribosomes/genetics
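
The mixture-model and expectation-maximization framework named in the abstract above can be sketched in a generic form. This example assumes a two-component Gaussian mixture with a shared variance and a single free mixing weight. It omits marker-conditional genotype probabilities and double reduction, so it is not the paper's QTL model. The function name `em_two_component` and the phenotype values are hypothetical.

```python
import math


def em_two_component(data, n_iter=50):
    """Fit a two-component Gaussian mixture with shared variance by EM."""
    mu = [min(data), max(data)]  # crude but deterministic starting means
    mean = sum(data) / len(data)
    var = sum((x - mean) ** 2 for x in data) / len(data)
    weight = 0.5  # mixing proportion of component 0
    for _ in range(n_iter):
        # E-step: posterior probability that each observation belongs to
        # component 0 (the shared normalizing constant cancels out).
        resp = []
        for x in data:
            a = weight * math.exp(-((x - mu[0]) ** 2) / (2.0 * var))
            b = (1.0 - weight) * math.exp(-((x - mu[1]) ** 2) / (2.0 * var))
            resp.append(a / (a + b))
        # M-step: update means, shared variance, and mixing proportion.
        n0 = sum(resp)
        n1 = len(data) - n0
        mu = [
            sum(r * x for r, x in zip(resp, data)) / n0,
            sum((1.0 - r) * x for r, x in zip(resp, data)) / n1,
        ]
        var = sum(
            r * (x - mu[0]) ** 2 + (1.0 - r) * (x - mu[1]) ** 2
            for r, x in zip(resp, data)
        ) / len(data)
        weight = n0 / len(data)
    return mu, var, weight


# Hypothetical phenotypes drawn from two well-separated genotype groups.
PHENOTYPES = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
means, shared_var, mix_weight = em_two_component(PHENOTYPES)
```

In a QTL setting the component means would play the role of genotype-specific phenotype means, and the mixing proportions would typically be derived from marker genotypes and recombination parameters rather than estimated freely as they are here.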