Search | VHL Regional Portal (Portal Regional de la BVS)

Consensus of All Solutions for Intractable Phylogenetic Tree Inference.

Tabaszewski, P; Gorecki, P; Markin, A; Anderson, T; Eulenstein, O.

IEEE/ACM Trans Comput Biol Bioinform ; 18(1): 149-161, 2021.

Article in English | MEDLINE | ID: mdl-31613775

ABSTRACT

Solving median tree problems is a classic approach for inferring species trees from a collection of discordant gene trees. Median tree problems are typically NP-hard and dealt with by local search heuristics. Unfortunately, such heuristics generally lack provable correctness and precision. Algorithmic advances addressing this uncertainty have led to exact dynamic programming formulations suitable to solve a well-studied group of median tree problems for smaller phylogenetic analyses. However, these formulations allow computing only very few optimal species trees out of possibly many such trees, and phylogenetic studies often require the analysis of all optimal solutions through their consensus tree. Here, we describe a significant algorithmic modification of the dynamic programming formulations that computes the cluster counts of all optimal species trees, from which various types of consensus trees can be efficiently computed. Through experimental studies, we demonstrate that our parallel implementation of the modified dynamic programming formulation is more efficient than a previous implementation of the original formulation. Finally, we show that the parallel implementation can rapidly identify novel reassorted influenza A viruses, potentially facilitating pandemic preparedness efforts.

Subject(s)

Computational Biology/methods , Models, Genetic , Phylogeny , Animals , Consensus , Genes/genetics , Genome, Viral/genetics , Heuristics , Humans , Influenza A virus/genetics , Orthomyxoviridae Infections/virology , Swine
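
The abstract above states that consensus trees can be computed from the cluster counts of all optimal species trees. The following is a minimal sketch of that final step only, not the authors' dynamic programming algorithm. It assumes the counts are already available as a dictionary from clusters (sets of taxa) to the number of optimal trees containing them; the function names and the toy counts are hypothetical.

```python
from fractions import Fraction


def consensus_clusters(cluster_counts, num_trees, threshold=Fraction(1, 2)):
    """Clusters whose frequency among the trees strictly exceeds `threshold`.

    cluster_counts: dict mapping frozenset of taxa -> number of trees
    that contain that cluster (assumed input format).
    The default threshold of 1/2 yields the majority-rule consensus clusters.
    """
    return {
        cluster
        for cluster, count in cluster_counts.items()
        if Fraction(count, num_trees) > threshold
    }


def strict_clusters(cluster_counts, num_trees):
    """Clusters present in every tree (strict consensus)."""
    return {c for c, n in cluster_counts.items() if n == num_trees}


# Toy counts for three hypothetical optimal trees over taxa a, b, c, d:
#   ((a,b),(c,d)), (((a,b),c),d), (((a,c),b),d)
counts = {
    frozenset("ab"): 2,
    frozenset("cd"): 1,
    frozenset("abc"): 2,
    frozenset("ac"): 1,
    frozenset("abcd"): 3,
}
majority = consensus_clusters(counts, num_trees=3)
strict = strict_clusters(counts, num_trees=3)
```

Here `majority` contains the clusters {a,b}, {a,b,c} and {a,b,c,d}, while `strict` contains only {a,b,c,d}. Exact fractions are used so that a cluster occurring in exactly half of the trees is excluded without floating-point ambiguity.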

Locating large-scale gene duplication events through reconciled trees: implications for identifying ancient polyploidy events in plants.

Burleigh, J G; Bansal, M S; Wehe, A; Eulenstein, O.

J Comput Biol ; 16(8): 1071-83, 2009 Aug.

Article in English | MEDLINE | ID: mdl-19689214

ABSTRACT

Recent analyses of plant genomic data have found extensive evidence of ancient whole genome duplication (or polyploidy) events, but there are many unresolved questions regarding the number and timing of such events in plant evolutionary history. We describe the first exact and efficient algorithm for the Episode Clustering problem, which, given a collection of rooted gene trees and a rooted species tree, seeks the minimum number of locations on the species tree of gene duplication events. Solving this problem allows one to place gene duplication events onto nodes of a given species tree and potentially detect large-scale gene duplication events. We examined the performance of an implementation of our algorithm using 85 plant gene trees that contain genes from a total of 136 plant taxa. We found evidence of large-scale gene duplication events in Populus, Gossypium, Poaceae, Asteraceae, Brassicaceae, Solanaceae, Fabaceae, and near the root of the eudicot clade that are consistent with previous genomic evidence. However, a lack of phylogenetic signal within the gene trees can produce erroneous evidence of large-scale duplication events, especially near the root of the species tree. Although the results of our algorithm should be interpreted cautiously, they provide hypotheses for precise locations of large-scale gene duplication events with data from relatively few gene trees and can complement other genomic approaches to provide a more comprehensive view of ancient large-scale gene duplication events.

Subject(s)

Algorithms , Evolution, Molecular , Gene Duplication , Genomics/methods , Plants/genetics , Polyploidy , Genome, Plant

Rainbow: a toolbox for phylogenetic supertree construction and analysis.

Chen, D; Eulenstein, O; Fernández-Baca, D.

Bioinformatics ; 20(16): 2872-3, 2004 Nov 01.

Article in English | MEDLINE | ID: mdl-15145807

ABSTRACT

Rainbow is a program that provides a graphical user interface to construct supertrees using different methods. It also provides tools to analyze the quality of the supertrees produced. Rainbow is available for Mac OS X, Windows and Linux. AVAILABILITY: Rainbow is free, open-source software. Its binary files, source code, and manual can be downloaded from the Rainbow web page: http://genome.cs.iastate.edu/Rainbow/

Subject(s)

Algorithms , DNA Mutational Analysis/methods , Phylogeny , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Software , User-Computer Interface , Cluster Analysis , Computer Graphics , Oligonucleotide Array Sequence Analysis/methods

Investigating evolutionary lines of least resistance using the inverse protein-folding problem.

Schonfeld, J; Eulenstein, O; Velden, K Vander; Naylor, G J P.

Pac Symp Biocomput ; : 613-24, 2002.

Article in English | MEDLINE | ID: mdl-11928513

ABSTRACT

We present a polynomial time algorithm for estimating optimal HP sequences that fold to a specified target protein conformation based on Sun et al's Grand Canonical (GC) model. Application of the algorithm to related proteins taken from the PDB allows us to explore the nature of the protein genotype:phenotype map. Results suggest: (1) that the GC model captures important biological aspects of the mapping between protein sequences and their corresponding structures, and (2) the set of sequences that map to a target structure with optimal energy is affected by minor differences in structure.

Subject(s)

Biological Evolution , Protein Folding , Proteins/genetics , Proteins/metabolism , Algorithms , Animals , Computer Simulation , Databases, Protein , Humans , Lipoma/genetics , Phenotype , Protein Conformation , Proteins/chemistry , Sequence Homology, Amino Acid , Glycine max/classification , Glycine max/genetics , Transformation, Genetic

Towards detection of orthologues in sequence databases.

Yuan, Y P; Eulenstein, O; Vingron, M; Bork, P.

Bioinformatics ; 14(3): 285-9, 1998.

Article in English | MEDLINE | ID: mdl-9614272

ABSTRACT

MOTIVATION: Numerous homologous sequences from diverse species can be retrieved from databases using programs such as BLAST. However, due to multigene families, evolutionary relationships often cannot be easily determined and proper functional assignment becomes difficult. Thus, discrimination between orthologues and paralogues within BLAST output lists of homologous sequences becomes more and more important. RESULT: We therefore developed a method that attempts to construct a reconciled tree from a gene tree of selected sequences and its corresponding phylogenetic tree of the species involved (species tree). A Web interface was developed to enable users to analyse the BLAST result. BLAST outputs are parsed and, for the selected sequences, multiple alignments are constructed either globally or for local regions. Bootstrapped trees are returned and compared with the expected species tree. In cases of discrepancies, gene duplications are assumed and a reconciled tree is computed. The reconciled tree shows probable orthologues and paralogues as predicted.

Subject(s)

Databases, Factual , Sequence Homology, Nucleic Acid , Computational Biology/methods , Evolution, Molecular , Internet , Phylogeny , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Software Validation
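
The abstract above describes comparing a gene tree with the expected species tree and assuming gene duplications where they disagree. A minimal sketch of one standard way to do this, the least-common-ancestor (LCA) mapping, is given below. It is not the authors' implementation: the nested-tuple tree encoding, the convention that gene-tree leaves are labelled with species names, and all function names are assumptions made for illustration.

```python
def leaf_set(tree):
    """Set of leaf labels below a node; a leaf is a plain string."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def species_clusters(species_tree):
    """Leaf sets of all subtrees of the species tree (one per node)."""
    clusters = [leaf_set(species_tree)]
    if not isinstance(species_tree, str):
        for child in species_tree:
            clusters.extend(species_clusters(child))
    return clusters


def lca_cluster(species, clusters):
    """Smallest species-tree cluster containing `species`, i.e. the LCA node.

    Clusters of a tree that contain a common taxon are nested, so the
    minimum by size is unique. Raises ValueError if a species is missing
    from the species tree.
    """
    return min((c for c in clusters if species <= c), key=len)


def count_duplications(gene_tree, species_tree):
    """Count gene-tree nodes that map to the same species node as a child."""
    clusters = species_clusters(species_tree)
    duplications = 0

    def visit(node):
        nonlocal duplications
        mapped = lca_cluster(leaf_set(node), clusters)
        if not isinstance(node, str):
            child_maps = [visit(child) for child in node]
            if mapped in child_maps:  # node and a child share an image
                duplications += 1
        return mapped

    visit(gene_tree)
    return duplications


species_tree = (("a", "b"), "c")
discordant_gene_tree = (("a", "b"), ("a", "c"))
n_dups = count_duplications(discordant_gene_tree, species_tree)
```

In this toy case the root of the gene tree and its child `("a", "c")` both map to the root of the species tree, so `n_dups` is 1. A gene tree identical to the species tree yields 0. The sketch recomputes leaf sets at every node for clarity, so it is quadratic in tree size; it only counts duplications and does not build the full reconciled tree.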

Duplication-based measures of difference between gene and species trees.

Eulenstein, O; Mirkin, B; Vingron, M.

J Comput Biol ; 5(1): 135-48, 1998.

Article in English | MEDLINE | ID: mdl-9541877

ABSTRACT

In the framework of a duplication-based method for comparing gene and species trees, the concepts of "duplication" and "loss" are reformulated in set-theoretic terms. A number of related tree dissimilarity measures is suggested, and relations between them are analyzed. For any node in the species tree, the number of gene duplications for which it is a "non-child" loss coincides with the number of times when the node's parent is an intermediate between the mapping images of a gene node and its parent. This implies that the total number of losses is equal to the number of intermediate nodes plus the number of one-side duplications and, thus, provides an alternative proof for a conjecture made by Mirkin, Muchnik, and Smith (1995). Another formula proven involves crossings (incompatible gene-species node pairs): the number of losses equals the number of crossings plus the number of duplications.

Subject(s)

Biological Evolution , Genes , Animals , DNA/chemistry , Models, Genetic , Multigene Family , Sequence Analysis, DNA
