Search | VHL Regional Portal

Flexible expressed region analysis for RNA-seq with derfinder.

Collado-Torres, Leonardo; Nellore, Abhinav; Frazee, Alyssa C; Wilks, Christopher; Love, Michael I; Langmead, Ben; Irizarry, Rafael A; Leek, Jeffrey T; Jaffe, Andrew E.

Nucleic Acids Res ; 45(2): e9, 2017 Jan 25.

Article in English | MEDLINE | ID: mdl-27694310

ABSTRACT

Differential expression analysis of RNA sequencing (RNA-seq) data typically relies on reconstructing transcripts or counting reads that overlap known gene structures. We previously introduced an intermediate statistical approach called differentially expressed region (DER) finder that seeks to identify contiguous regions of the genome showing differential expression signal at single base resolution without relying on existing annotation or potentially inaccurate transcript assembly. We present the derfinder software that improves our annotation-agnostic approach to RNA-seq analysis by: (i) implementing a computationally efficient bump-hunting approach to identify DERs that permits genome-scale analyses in a large number of samples, (ii) introducing a flexible statistical modeling framework, including multi-group and time-course analyses and (iii) introducing a new set of data visualizations for expressed region analysis. We apply this approach to public RNA-seq data from the Genotype-Tissue Expression (GTEx) project and BrainSpan project to show that derfinder permits the analysis of hundreds of samples at base resolution in R, identifies expression outside of known gene boundaries and can be used to visualize expressed regions at base resolution. In simulations, our base resolution approaches enable discovery in the presence of incomplete annotation and are nearly as powerful as feature-level methods when the annotation is complete. derfinder analysis using expressed region-level and single base-level approaches provides a compromise between full transcript reconstruction and feature-level analysis. The package is available from Bioconductor at www.bioconductor.org/packages/derfinder.
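The expressed region-level idea mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes that an expressed region is a contiguous run of bases whose mean coverage across samples exceeds a cutoff; the function name `expressed_regions`, the toy coverage values and the cutoff are hypothetical and are not part of the derfinder package, which is written in R.

```python
from statistics import mean


def expressed_regions(coverage, cutoff):
    """Return half-open (start, end) intervals whose mean coverage exceeds cutoff.

    coverage: list of per-sample lists, each holding the read depth at every base.
    """
    n_bases = len(coverage[0])
    # Average the depth across samples at each base position.
    mean_cov = [mean(sample[i] for sample in coverage) for i in range(n_bases)]

    regions = []
    start = None
    for pos, value in enumerate(mean_cov):
        if value > cutoff and start is None:
            start = pos                      # a new region opens here
        elif value <= cutoff and start is not None:
            regions.append((start, pos))     # the open region closes here
            start = None
    if start is not None:                    # region runs to the last base
        regions.append((start, n_bases))
    return regions


# Hypothetical coverage for three samples over ten bases.
toy_coverage = [
    [0, 0, 5, 6, 7, 0, 0, 3, 4, 0],
    [0, 1, 4, 6, 8, 1, 0, 2, 5, 0],
    [0, 0, 6, 5, 9, 0, 0, 4, 3, 0],
]
print(expressed_regions(toy_coverage, cutoff=2.0))  # [(2, 5), (7, 9)]
```

No annotation is consulted at any point, which is what makes this kind of approach annotation-agnostic; the regions can be compared with known genes afterwards.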

Subject(s)

Gene Expression Profiling/methods , Software , Gene Expression Regulation , Genomics/methods , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation , Organ Specificity/genetics , Transcriptome , Web Browser

Polyester: simulating RNA-seq datasets with differential transcript expression.

Frazee, Alyssa C; Jaffe, Andrew E; Langmead, Ben; Leek, Jeffrey T.

Bioinformatics ; 31(17): 2778-84, 2015 Sep 01.

Article in English | MEDLINE | ID: mdl-25926345

ABSTRACT

MOTIVATION: Statistical methods development for differential expression analysis of RNA sequencing (RNA-seq) requires software tools to assess accuracy and error rate control. Since true differential expression status is often unknown in experimental datasets, artificially constructed datasets must be utilized, either by generating costly spike-in experiments or by simulating RNA-seq data. RESULTS: Polyester is an R package designed to simulate RNA-seq data, beginning with an experimental design and ending with collections of RNA-seq reads. Its main advantage is the ability to simulate reads indicating isoform-level differential expression across biological replicates for a variety of experimental designs. Data generated by Polyester is a reasonable approximation to real RNA-seq data and standard differential expression workflows can recover differential expression set in the simulation by the user. AVAILABILITY AND IMPLEMENTATION: Polyester is freely available from Bioconductor (http://bioconductor.org/). CONTACT: jtleek@gmail.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
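The first step of such a simulation, drawing replicate counts with a known differential expression status, can be sketched in Python. This is only an illustration: it assumes a negative binomial count model (which the abstract does not specify), stops at counts rather than generating reads, and all names (`simulate_counts`, `fold_changes`, `size`) are hypothetical rather than taken from the Polyester package.

```python
import math
import random


def poisson(rng, lam):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def negative_binomial(rng, mean, size):
    """Gamma-Poisson mixture with mean `mean` and variance mean + mean**2 / size."""
    lam = rng.gammavariate(size, mean / size)
    return poisson(rng, lam)


def simulate_counts(base_means, fold_changes, n_per_group, size=10.0, seed=0):
    """Return one row per transcript: group 1 replicates, then group 2 replicates."""
    rng = random.Random(seed)
    table = []
    for mu, fc in zip(base_means, fold_changes):
        group1 = [negative_binomial(rng, mu, size) for _ in range(n_per_group)]
        # The fold change is the "truth" set by the user for group 2.
        group2 = [negative_binomial(rng, mu * fc, size) for _ in range(n_per_group)]
        table.append(group1 + group2)
    return table


# Two hypothetical transcripts: the first unchanged, the second up four-fold.
counts = simulate_counts([20.0, 20.0], [1.0, 4.0], n_per_group=3)
for row in counts:
    print(row)
```

Because the fold changes are chosen by the user, any differential expression method run on the result can be scored against a known truth.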

Subject(s)

Chromosomes, Human, Pair 22/genetics , Computational Biology/methods , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, RNA/methods , Software , Algorithms , Binomial Distribution , Europe , Gene Expression Regulation , Genetics, Population , Haplotypes/genetics , Humans , Protein Isoforms , RNA/genetics

Ballgown bridges the gap between transcriptome assembly and expression analysis.

Frazee, Alyssa C; Pertea, Geo; Jaffe, Andrew E; Langmead, Ben; Salzberg, Steven L; Leek, Jeffrey T.

Nat Biotechnol ; 33(3): 243-6, 2015 Mar.

Article in English | MEDLINE | ID: mdl-25748911

Subject(s)

Gene Expression Profiling/methods , Gene Expression Regulation , Software , Transcriptome/genetics , Female , Humans , Male , Quantitative Trait Loci/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism

Differential expression analysis of RNA-seq data at single-base resolution.

Frazee, Alyssa C; Sabunciyan, Sarven; Hansen, Kasper D; Irizarry, Rafael A; Leek, Jeffrey T.

Biostatistics ; 15(3): 413-26, 2014 Jul.

Article in English | MEDLINE | ID: mdl-24398039

ABSTRACT

RNA-sequencing (RNA-seq) is a flexible technology for measuring genome-wide expression that is rapidly replacing microarrays as costs become comparable. Current differential expression analysis methods for RNA-seq data fall into two broad classes: (1) methods that quantify expression within the boundaries of genes previously published in databases and (2) methods that attempt to reconstruct full length RNA transcripts. The first class cannot discover differential expression outside of previously known genes. While the second approach does possess discovery capabilities, statistical analysis of differential expression is complicated by the ambiguity and variability incurred while assembling transcripts and estimating their abundances. Here, we propose a novel method that first identifies differentially expressed regions (DERs) of interest by assessing differential expression at each base of the genome. The method then segments the genome into regions comprised of bases showing similar differential expression signal, and then assigns a measure of statistical significance to each region. Optionally, DERs can be annotated using a reference database of genomic features. We compare our approach with leading competitors from both current classes of differential expression methods and highlight the strengths and weaknesses of each. A software implementation of our method is available on github (https://github.com/alyssafrazee/derfinder).
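The two steps described above, a per-base test followed by segmentation into regions with similar signal, can be sketched in Python. This is a simplified illustration under stated assumptions: a Welch-style t statistic is used at each base, and a fixed threshold stands in for whatever segmentation model the authors use. The names `base_statistic`, `call_state` and `segment`, the offset and the toy data are hypothetical, and the step that assigns statistical significance to each region is omitted.

```python
from itertools import groupby
from math import sqrt
from statistics import mean, variance


def base_statistic(group_a, group_b, offset=1.0):
    """Welch-style t statistic for one base; offset guards against a zero denominator."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_b) - mean(group_a)) / (se + offset)


def call_state(stat, threshold):
    """Label a base as over-expressed, under-expressed, or unchanged in group B."""
    if stat > threshold:
        return "up"
    if stat < -threshold:
        return "down"
    return "none"


def segment(states):
    """Merge consecutive bases sharing a state into half-open (start, end, state) regions."""
    regions, pos = [], 0
    for state, run in groupby(states):
        length = len(list(run))
        regions.append((pos, pos + length, state))
        pos += length
    return regions


# Hypothetical coverage: two samples per group, six bases each.
cov_a = [[10, 10, 10, 10, 10, 10], [12, 11, 9, 10, 11, 12]]
cov_b = [[10, 11, 30, 32, 31, 11], [11, 10, 28, 33, 29, 10]]

stats = [
    base_statistic([s[i] for s in cov_a], [s[i] for s in cov_b])
    for i in range(len(cov_a[0]))
]
states = [call_state(t, threshold=3.0) for t in stats]
print(segment(states))  # [(0, 2, 'none'), (2, 5, 'up'), (5, 6, 'none')]
```

Regions labelled "up" or "down" are the candidate DERs; annotating them against a reference database would be a separate, optional step.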

Subject(s)

Gene Expression Profiling/methods , Genomics/methods , Sequence Analysis, RNA/methods , Humans

ReCount: a multi-experiment resource of analysis-ready RNA-seq gene count datasets.

Frazee, Alyssa C; Langmead, Ben; Leek, Jeffrey T.

BMC Bioinformatics ; 12: 449, 2011 Nov 16.

Article in English | MEDLINE | ID: mdl-22087737

ABSTRACT

BACKGROUND: RNA sequencing is a flexible and powerful new approach for measuring gene, exon, or isoform expression. To maximize the utility of RNA sequencing data, new statistical methods are needed for clustering, differential expression, and other analyses. A major barrier to the development of new statistical methods is the lack of RNA sequencing datasets that can be easily obtained and analyzed in common statistical software packages such as R. To speed up the development process, we have created a resource of analysis-ready RNA-sequencing datasets. DESCRIPTION: ReCount is an online resource of RNA-seq gene count tables and auxiliary data. Tables were built from raw RNA sequencing data from 18 different published studies comprising 475 samples and over 8 billion reads. Using the Myrna package, reads were aligned, overlapped with gene models and tabulated into gene-by-sample count tables that are ready for statistical analysis. Count tables and phenotype data were combined into Bioconductor ExpressionSet objects for ease of analysis. ReCount also contains the Myrna manifest files and R source code used to process the samples, allowing statistical and computational scientists to consider alternative parameter values. CONCLUSIONS: By combining datasets from many studies and providing data that has already been processed from .fastq format into ready-to-use .RData and .txt files, ReCount facilitates analysis and methods development for RNA-seq count data. We anticipate that ReCount will also be useful for investigators who wish to consider cross-study comparisons and alternative normalization strategies for RNA-seq.
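The tabulation step, overlapping aligned reads with gene models to build a gene-by-sample count table, can be sketched in Python. This is a minimal illustration and not Myrna's actual logic: it assumes single-chromosome, half-open intervals and assigns each read to the first gene it overlaps. The gene names, sample names and coordinates are hypothetical.

```python
def overlapping_gene(read_start, read_end, gene_models):
    """Return the first gene whose half-open interval overlaps the read, else None."""
    for gene, (g_start, g_end) in gene_models.items():
        if read_start < g_end and g_start < read_end:
            return gene
    return None


def count_table(alignments, gene_models):
    """Build {gene: {sample: count}} from per-sample lists of (start, end) reads."""
    samples = sorted(alignments)
    counts = {gene: {s: 0 for s in samples} for gene in gene_models}
    for sample, reads in alignments.items():
        for start, end in reads:
            gene = overlapping_gene(start, end, gene_models)
            if gene is not None:          # reads outside every gene are dropped
                counts[gene][sample] += 1
    return counts


# Hypothetical gene models and aligned reads on one chromosome.
gene_models = {"geneA": (100, 200), "geneB": (300, 400)}
alignments = {
    "sample1": [(90, 110), (150, 180), (310, 340), (500, 530)],
    "sample2": [(195, 225), (350, 380), (390, 420)],
}

table = count_table(alignments, gene_models)
for gene, row in table.items():
    print(gene, row)
# geneA {'sample1': 2, 'sample2': 1}
# geneB {'sample1': 1, 'sample2': 2}
```

Note that the read at (500, 530) is discarded because it overlaps no gene model, which illustrates the limitation of annotation-based counting that the single-base approaches listed above are meant to address.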

Subject(s)

Gene Expression Profiling/methods , RNA/analysis , Sequence Analysis, RNA/methods , Software , Animals , Humans , RNA/genetics
