1.
Bioinformatics ; 26(12): 1481-7, 2010 Jun 15.
Article in English | MEDLINE | ID: mdl-20439257

ABSTRACT

MOTIVATION: Identifying orthologous genes in multiple genomes is a fundamental task in comparative genomics. Construction of intergenomic symmetrical best matches (SymBets) and joining them into clusters is a popular method of ortholog definition, embodied in several software programs. Despite their wide use, the computational complexity of these programs has not been thoroughly examined. RESULTS: In this work, we show that in the standard approach of iteration through all triangles of SymBets, the memory scales with at least the number of these triangles, O(g^3) (where g = number of genomes), and construction time scales with the iteration through each pair, i.e. O(g^6). We propose the EdgeSearch algorithm that iterates over edges in the SymBet graph rather than triangles of SymBets, and as a result has a worst-case complexity of only O(g^3 log g). Several optimizations reduce the run-time even further in realistically sparse graphs. In two real-world datasets of genomes from bacteriophages (POGs) and Mollicutes (MOGs), an implementation of the EdgeSearch algorithm runs about an order of magnitude faster than the original algorithm and scales much better with increasing number of genomes, with only minor differences in the final results, and up to 60 times faster than the popular OrthoMCL program with a 90% overlap between the identified groups of orthologs. AVAILABILITY AND IMPLEMENTATION: C++ source code freely available for download at ftp.ncbi.nih.gov/pub/wolf/COGs/COGsoft/. SUPPLEMENTARY INFORMATION: Supplementary materials are available at Bioinformatics online.
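
As a rough illustration of the terms used in this abstract, the sketch below builds symmetrical best matches from a table of pairwise similarity scores and then lists the triangles they form. It is not the EdgeSearch implementation and does not reproduce its complexity bounds. The `genome:gene` naming scheme, the score table, and the function names are all assumptions made for this example.

```python
def best_hits(scores):
    """Return {(query_gene, target_genome): best-scoring gene in that genome}.

    scores maps (query_gene, target_gene) -> similarity score.
    Gene names are assumed (hypothetically) to look like 'genome:gene'.
    """
    best = {}
    for (query, target), score in scores.items():
        query_genome = query.split(":")[0]
        target_genome = target.split(":")[0]
        if query_genome == target_genome:
            continue  # only intergenomic matches are considered
        key = (query, target_genome)
        if key not in best or score > best[key][1]:
            best[key] = (target, score)
    return {key: hit for key, (hit, _) in best.items()}


def symbets(scores):
    """Return the set of symmetrical best matches as 2-element frozensets."""
    best = best_hits(scores)
    edges = set()
    for (query, _target_genome), hit in best.items():
        query_genome = query.split(":")[0]
        # The match is symmetrical if the hit's best match back is the query.
        if best.get((hit, query_genome)) == query:
            edges.add(frozenset((query, hit)))
    return edges


def triangles(edges):
    """Return all triangles of SymBets, found via shared neighbours of each edge."""
    neighbours = {}
    for edge in edges:
        a, b = tuple(edge)
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    found = set()
    for edge in edges:
        a, b = tuple(edge)
        for c in neighbours[a] & neighbours[b]:
            found.add(frozenset((a, b, c)))
    return found


# Hypothetical scores for three genomes A, B, C; a2 is a weaker paralog of a1.
scores = {
    ("A:a1", "B:b1"): 100, ("B:b1", "A:a1"): 100,
    ("A:a1", "C:c1"): 90,  ("C:c1", "A:a1"): 90,
    ("B:b1", "C:c1"): 80,  ("C:c1", "B:b1"): 80,
    ("A:a2", "B:b1"): 50,  ("B:b1", "A:a2"): 50,
}
edges = symbets(scores)
tris = triangles(edges)
```

In this toy input, `a2` has `b1` as its best match, but `b1` prefers `a1`, so the pair `a2`/`b1` is not symmetrical and is left out.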


Subject(s)
Algorithms , Genome , Genomics/methods , Cluster Analysis
2.
PLoS One ; 4(4): e5326, 2009.
Article in English | MEDLINE | ID: mdl-19390636

ABSTRACT

A phyletic vector, also known as a phyletic (or phylogenetic) pattern, is a binary representation of the presences and absences of orthologous genes in different genomes. Joint occurrence of two or more genes in many genomes results in closely similar binary vectors representing these genes, and this similarity between gene vectors may be used as a measure of functional association between genes. Better understanding of quantitative properties of gene co-occurrences is needed for systematic studies of gene function and evolution. We used the probabilistic iterative algorithm Psi-square to find groups of similar phyletic vectors. An extended Psi-square algorithm, in which pseudocounts are implemented, shows better sensitivity in identifying proteins with known functional links than our earlier hierarchical clustering approach. At the same time, the specificity of inferring functional associations between genes in prokaryotic genomes is strongly dependent on the pathway: phyletic vectors of the genes involved in energy metabolism and in de novo biosynthesis of the essential precursors tend to be lumped together, whereas cellular modules involved in secretion, motility, assembly of cell surfaces, biosynthesis of some coenzymes, and utilization of secondary carbon sources tend to be identified with much greater specificity. It appears that the network of gene coinheritance in prokaryotes contains a giant connected component that encompasses most biosynthetic subsystems, along with a series of more independent modules involved in cell interaction with the environment.
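
To make the idea of a phyletic vector concrete, the following sketch encodes gene presence and absence across a fixed genome list and compares two vectors. The Jaccard index is used here only as a simple stand-in similarity measure; it is not the Psi-square algorithm described in the abstract. Genome and gene names are hypothetical.

```python
def phyletic_vector(present_in, genomes):
    """Binary presence/absence vector of one gene over an ordered genome list."""
    return tuple(1 if genome in present_in else 0 for genome in genomes)


def jaccard(u, v):
    """Jaccard similarity of two binary vectors (shared presences / any presence)."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return both / either if either else 0.0


# Hypothetical genomes and gene occurrences.
genomes = ["G1", "G2", "G3", "G4", "G5"]
gene_x = phyletic_vector({"G1", "G2", "G4"}, genomes)        # (1, 1, 0, 1, 0)
gene_y = phyletic_vector({"G1", "G2", "G4", "G5"}, genomes)  # (1, 1, 0, 1, 1)
gene_z = phyletic_vector({"G3"}, genomes)                    # (0, 0, 1, 0, 0)

similar = jaccard(gene_x, gene_y)    # genes that co-occur in most genomes
unrelated = jaccard(gene_x, gene_z)  # genes that never co-occur
```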


Subject(s)
Computational Biology/methods , Phylogeny , Algorithms , Cluster Analysis , Molecular Evolution , Genome
3.
Methods ; 40(4): 303-11, 2006 Dec.
Article in English | MEDLINE | ID: mdl-17101441

ABSTRACT

Mass spectrometry-based approaches are commonly used to identify proteins from multiprotein complexes, typically with the goal of identifying new complex members or identifying post-translational modifications. However, with the recent demonstration that spectral counting is a powerful quantitative proteomic approach, the analysis of multiprotein complexes by mass spectrometry can be reconsidered in certain cases. Using the chromatography-based approach named multidimensional protein identification technology, multiprotein complexes may be analyzed quantitatively using the normalized spectral abundance factor that allows comparison of multiple independent analyses of samples. This study describes an approach to visualize multiprotein complex datasets that provides structure function information that is superior to tabular lists of data. In this method review, we describe a reanalysis of the Rpd3/Sin3 small and large histone deacetylase complexes previously described in a tabular form to demonstrate the normalized spectral abundance factor approach.
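
The abstract does not state the formula, but the normalized spectral abundance factor is commonly defined as each protein's spectral count divided by its length, normalized by the sum of that quantity over all proteins in the run. A minimal sketch under that assumption follows; the protein names, counts, and lengths are hypothetical.

```python
def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor per protein.

    Assumes the common definition: (SpC / L) for one protein divided by the
    sum of (SpC / L) over all proteins in the same analysis.
    """
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectra were counted for any protein")
    return {p: value / total for p, value in saf.items()}


# Hypothetical run: the longer protein collects more spectra, yet both
# proteins have the same length-corrected abundance.
counts = {"protA": 10, "protB": 30}
lengths = {"protA": 100, "protB": 300}
values = nsaf(counts, lengths)
```

Because the values within one run sum to 1, they can be compared across independent analyses, which is the property the abstract relies on.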


Subject(s)
Chromatin Assembly and Disassembly/physiology , Protein Databases , Mass Spectrometry/methods , Multiprotein Complexes/isolation & purification , Post-Translational Protein Processing/physiology , Proteomics/methods , Proteomics/instrumentation , Structure-Activity Relationship
4.
J Proteome Res ; 5(9): 2339-47, 2006 Sep.
Article in English | MEDLINE | ID: mdl-16944946

ABSTRACT

We have devised an approach for analyzing shotgun proteomics datasets based on the normalized spectral abundance factor that can be used for quantitative proteomics analysis. Three biological replicates of samples enriched for plasma membranes were isolated from S. cerevisiae grown in 14N-rich media and 15N-minimal media and analyzed via quantitative multidimensional protein identification technology. The natural log transformation of NSAF values from S. cerevisiae cells grown in 14N YPD media and 15N-minimal media had a normal distribution. The t-test analysis demonstrated 221 of 1316 proteins were significantly overexpressed in one or the other growth conditions with a p value <0.05. Notably, amino acid transporters were among the 14 membrane proteins that were significantly upregulated in cells grown in minimal media, and we functionally validated these increases in protein expression with radioisotope uptake assays for selected proteins.
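
The comparison described here (natural-log transform of NSAF values, then a t-test across replicates) can be sketched as below. The abstract does not say which t-test variant was used, so Welch's unequal-variance statistic is an assumption, and only the statistic is computed; obtaining a p-value would additionally need the t distribution, which is omitted. The replicate values are invented for illustration.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic for two independent samples (sample variances)."""
    return (mean(x) - mean(y)) / math.sqrt(
        variance(x) / len(x) + variance(y) / len(y)
    )


def ln_nsaf_t(minimal, rich):
    """t statistic on ln(NSAF); positive when the protein is higher in minimal media."""
    return welch_t([math.log(v) for v in minimal], [math.log(v) for v in rich])


# Hypothetical NSAF values for one protein, three replicates per condition.
rich_media = [0.0010, 0.0012, 0.0011]
minimal_media = [0.0040, 0.0045, 0.0038]
t_stat = ln_nsaf_t(minimal_media, rich_media)
```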


Subject(s)
Fungal Gene Expression Regulation , Membrane Proteins/metabolism , Proteomics/methods , Saccharomyces cerevisiae/metabolism , Statistical Data Interpretation , Nitrogen Radioisotopes , Saccharomyces cerevisiae/genetics
5.
J Biol Chem ; 280(50): 41207-12, 2005 Dec 16.
Article in English | MEDLINE | ID: mdl-16230350

ABSTRACT

The mammalian Tip49a and Tip49b proteins belong to an evolutionarily conserved family of AAA+ ATPases. In Saccharomyces cerevisiae, orthologs of Tip49a and Tip49b, called Rvb1 and Rvb2, respectively, are subunits of two distinct ATP-dependent chromatin remodeling complexes, SWR1 and INO80. We recently demonstrated that the mammalian Tip49a and Tip49b proteins are integral subunits of a chromatin remodeling complex bearing striking similarities to the S. cerevisiae SWR1 complex (Cai, Y., Jin, J., Florens, L., Swanson, S. K., Kusch, T., Li, B., Workman, J. L., Washburn, M. P., Conaway, R. C., and Conaway, J. W. (2005) J. Biol. Chem. 280, 13665-13670). In this report, we identify a new mammalian Tip49a- and Tip49b-containing ATP-dependent chromatin remodeling complex, which includes orthologs of 8 of the 15 subunits of the S. cerevisiae INO80 chromatin remodeling complex as well as at least five additional subunits unique to the human INO80 (hINO80) complex. Finally, we demonstrate that, similar to the yeast INO80 complex, the hINO80 complex exhibits DNA- and nucleosome-activated ATPase activity and catalyzes ATP-dependent nucleosome sliding.


Subject(s)
Chromatin/chemistry , Saccharomyces cerevisiae Proteins/chemistry , ATPases Associated with Diverse Cellular Activities , Adenosine Triphosphatases/chemistry , Adenosine Triphosphatases/metabolism , Adenosine Triphosphate/chemistry , Carrier Proteins/chemistry , Catalysis , Cell Line , Chromatin Assembly and Disassembly , Chromatography , Chromosomes/ultrastructure , DNA/chemistry , DNA Helicases/chemistry , Complementary DNA/metabolism , Polyacrylamide Gel Electrophoresis , Fungal Proteins/chemistry , HeLa Cells , Humans , Mass Spectrometry , Nucleosomes/chemistry , Nucleosomes/metabolism , Protein Binding , Tertiary Protein Structure , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism
6.
Anal Chem ; 77(19): 6218-24, 2005 Oct 01.
Article in English | MEDLINE | ID: mdl-16194081

ABSTRACT

In this study, S. cerevisiae crude membrane fractions were prepared using the acid-labile detergent RapiGest from cells grown under rich and minimal media conditions using 14N and 15N ammonium sulfate as the sole nitrogen source. Four independent MudPIT analyses of 1:1 mixtures of sample were prepared and analyzed via quantitative multidimensional protein identification technology on a two-dimensional ion trap mass spectrometer. Using the method described in this study, low-abundance integral membrane proteins with up to 14 transmembrane domains were identified and their protein expression determined when sufficient spectrum counting and ion chromatogram information was generated. We demonstrate that spectrum counting and mass spectrometry derived ion chromatograms strongly correlate for determining quantitative changes in protein expression. Spectrum counting proved more reproducible and has a wider dynamic range contributing to the deviation of the two quantitative approaches from a perfect positive correlation.
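
One plausible way to check how well two quantitation methods agree is to compute a Pearson correlation between the log2 expression ratios each method reports per protein. The sketch below does that; it is an illustration of the comparison, not the authors' analysis, and every number in it is hypothetical.

```python
import math


def log2_ratios(numerators, denominators):
    """Per-protein log2 ratio of paired measurements (e.g. 15N over 14N)."""
    return [math.log2(n / d) for n, d in zip(numerators, denominators)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical measurements for four proteins under two conditions.
spectral_15n, spectral_14n = [40, 12, 8, 30], [10, 12, 16, 15]
area_15n, area_14n = [3.9e6, 1.1e6, 0.9e6, 2.2e6], [1.0e6, 1.0e6, 2.0e6, 1.0e6]

count_ratios = log2_ratios(spectral_15n, spectral_14n)
area_ratios = log2_ratios(area_15n, area_14n)
agreement = pearson(count_ratios, area_ratios)
```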


Subject(s)
Liquid Chromatography/methods , Mass Spectrometry/methods , Peptides/analysis , Peptides/chemistry , Proteomics/methods , Amino Acid Sequence , Isotope Labeling , Molecular Sequence Data , Peptides/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism