Search | VHL Regional Portal

Protein Sequence Annotation Tool (PSAT): a centralized web-based meta-server for high-throughput sequence annotations.

Leung, Elo; Huang, Amy; Cadag, Eithon; Montana, Aldrin; Soliman, Jan Lorenz; Zhou, Carol L Ecale.

BMC Bioinformatics ; 17: 43, 2016 Jan 20.

Article in English | MEDLINE | ID: mdl-26792120

ABSTRACT

BACKGROUND: Here we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics tools, (2) enable functional annotations and enzyme predictions over large input protein FASTA data sets, and (3) provide a web interface for convenient execution of the tools. RESULTS: In this paper, we demonstrate the utility of PSAT by annotating the predicted peptide gene products of Herbaspirillum sp. strain RV1423, importing the results of PSAT into EC2KEGG, and using the resulting functional comparisons to identify a putative catabolic pathway, thereby distinguishing RV1423 from a well-annotated Herbaspirillum species. This analysis demonstrates that high-throughput enzyme predictions, provided by PSAT processing, can be used to identify metabolic potential in an otherwise poorly annotated genome. CONCLUSIONS: PSAT is a meta-server that combines the results from several sequence-based annotation and function prediction codes, and is available at http://psat.llnl.gov/psat/. PSAT stands apart from other sequence-based genome annotation systems in providing a high-throughput platform for rapid de novo enzyme predictions and sequence annotations over large input protein sequence data sets in FASTA. PSAT is most appropriately applied in annotation of large protein FASTA sets that may or may not be associated with a single genome.

Subject(s)

Genome, Bacterial; Herbaspirillum/genetics; High-Throughput Nucleotide Sequencing/methods; Internet; Molecular Sequence Annotation/methods; Software; Computational Biology/methods; Computers; Water Microbiology

CombAlign: a code for generating a one-to-many sequence alignment from a set of pairwise structure-based sequence alignments.

Zhou, Carol L Ecale.

Source Code Biol Med ; 10: 9, 2015.

Article in English | MEDLINE | ID: mdl-26246852

ABSTRACT

BACKGROUND: In order to better define regions of similarity among related protein structures, it is useful to identify the residue-residue correspondences among proteins. Few codes exist for constructing a one-to-many multiple sequence alignment derived from a set of structure or sequence alignments, and a need was evident for creating such a tool for combining pairwise structure alignments that would allow for insertion of gaps in the reference structure. RESULTS: This report describes a new Python code, CombAlign, which takes as input a set of pairwise sequence alignments (which may be structure based) and generates a one-to-many, gapped, multiple structure- or sequence-based sequence alignment (MSSA). The use and utility of CombAlign was demonstrated by generating gapped MSSAs using sets of pairwise structure-based sequence alignments between structure models of the matrix protein (VP40) and pre-small/secreted glycoprotein (sGP) of Reston Ebolavirus and the corresponding proteins of several other filoviruses. The gapped MSSAs revealed structure-based residue-residue correspondences, which enabled identification of structurally similar versus differing regions in the Reston proteins compared to each of the other corresponding proteins. CONCLUSIONS: CombAlign is a new Python code that generates a one-to-many, gapped, multiple structure- or sequence-based sequence alignment (MSSA) given a set of pairwise sequence alignments (which may be structure based). CombAlign has utility in assisting the user in distinguishing structurally conserved versus divergent regions on a reference protein structure relative to other closely related proteins. CombAlign was developed in Python 2.6, and the source code is available for download from the GitHub code repository.
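The combining step described in this abstract can be illustrated with a minimal sketch. This is not the CombAlign source code: the function name `combine_pairwise` and the input format (pairs of equal-length gapped strings, reference first) are assumptions made for illustration only. The idea shown is that wherever any aligned sequence has an insertion relative to the reference, a gap of matching width is opened in the reference and in every other row.

```python
def combine_pairwise(pairs):
    """Merge pairwise alignments that share one reference into a one-to-many alignment.

    pairs: list of (ref_aligned, other_aligned) strings of equal length, '-' for gaps.
    Returns a list of rows: the gapped reference first, then one row per input pair.
    Hypothetical helper for illustration; not the CombAlign implementation.
    """
    reference = pairs[0][0].replace("-", "")
    n = len(reference)
    parsed = []
    for ref_aln, other_aln in pairs:
        if len(ref_aln) != len(other_aln):
            raise ValueError("aligned strings must have equal length")
        if ref_aln.replace("-", "") != reference:
            raise ValueError("all alignments must share the same reference")
        inserts = [""] * (n + 1)  # residues inserted before reference position i
        matched = [""] * n        # residue (or '-') aligned to reference position i
        pos = 0
        for r, o in zip(ref_aln, other_aln):
            if r == "-":
                inserts[pos] += o
            else:
                matched[pos] = o
                pos += 1
        parsed.append((inserts, matched))

    # Widest insertion seen at each slot decides how many gap columns to open.
    widths = [max(len(ins[i]) for ins, _ in parsed) for i in range(n + 1)]

    ref_row = "".join("-" * widths[i] + reference[i] for i in range(n))
    rows = [ref_row + "-" * widths[n]]
    for inserts, matched in parsed:
        row = "".join(inserts[i].ljust(widths[i], "-") + matched[i] for i in range(n))
        rows.append(row + inserts[n].ljust(widths[n], "-"))
    return rows


if __name__ == "__main__":
    example = [("AC-GT", "ACTGT"), ("ACGT", "A-GT")]
    for row in combine_pairwise(example):
        print(row)
```

In this toy input the first pair inserts a residue after the second reference position, so the reference and the second sequence both receive a gap column there. Insertions shorter than the widest one at a slot are left-justified and padded with gaps, which is one simple convention among several possible.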

Computational analysis of pathogen-borne metallo β-lactamases reveals discriminating structural features between B1 types.

Cadag, Eithon; Vitalis, Elizabeth; Lennox, Kristin P; Zhou, Carol L Ecale; Zemla, Adam T.

BMC Res Notes ; 5: 96, 2012 Feb 14.

Article in English | MEDLINE | ID: mdl-22333139

ABSTRACT

BACKGROUND: Genes conferring antibiotic resistance to groups of bacterial pathogens are cause for considerable concern, as many once-reliable antibiotics continue to see a reduction in efficacy. The recent discovery of the metallo β-lactamase blaNDM-1 gene, which appears to grant antibiotic resistance to a variety of Enterobacteriaceae via a mobile plasmid, is one example of this distressing trend. The following work describes a computational analysis of pathogen-borne MBLs that focuses on the structural aspects of characterized proteins. RESULTS: Using both sequence and structural analyses, we examine residues and structural features specific to various pathogen-borne MBL types. This analysis identifies a linker region within MBL-like folds that may act as a discriminating structural feature between these proteins, and specifically resistance-associated acquirable MBLs. Recently released crystal structures of the newly emerged NDM-1 protein were aligned against related MBL structures using a variety of global and local structural alignment methods, and the overall fold conformation is examined for structural conservation. Conservation appears to be present in most areas of the protein, yet is strikingly absent within a linker region, making NDM-1 unique with respect to a linker-based classification scheme. Variability analysis of the NDM-1 crystal structure highlights unique residues in key regions as well as identifying several characteristics shared with other transferable MBLs. CONCLUSIONS: A discriminating linker region identified in MBL proteins is highlighted and examined in the context of NDM-1 and primarily three other MBL types: IMP-1, VIM-2 and ccrA. The presence of an unusual linker region variant and uncommon amino acid composition at specific structurally important sites may help to explain the unusually broad kinetic profile of NDM-1 and may aid in directing research attention to areas of this protein, and possibly other MBLs, that may be targeted for inactivation or attenuation of enzymatic activity.

MannDB - a microbial database of automated protein sequence analyses and evidence integration for protein characterization.

Zhou, Carol L Ecale; Lam, Marisa W; Smith, Jason R; Zemla, Adam T; Dyer, Matthew D; Kuczmarski, Thomas A; Vitalis, Elizabeth A; Slezak, Thomas R.

BMC Bioinformatics ; 7: 459, 2006 Oct 17.

Article in English | MEDLINE | ID: mdl-17044936

ABSTRACT

BACKGROUND: MannDB was created to meet a need for rapid, comprehensive automated protein sequence analyses to support selection of proteins suitable as targets for driving the development of reagents for pathogen or protein toxin detection. Because a large number of open-source tools were needed, it was necessary to produce a software system to scale the computations for whole-proteome analysis. Thus, we built a fully automated system for executing software tools and for storage, integration, and display of automated protein sequence analysis and annotation data. DESCRIPTION: MannDB is a relational database that organizes data resulting from fully automated, high-throughput protein-sequence analyses using open-source tools. Types of analyses provided include predictions of cleavage, chemical properties, classification, features, functional assignment, post-translational modifications, motifs, antigenicity, and secondary structure. Proteomes (lists of hypothetical and known proteins) are downloaded and parsed from Genbank and then inserted into MannDB, and annotations from SwissProt are downloaded when identifiers are found in the Genbank entry or when identical sequences are identified. Currently 36 open-source tools are run against MannDB protein sequences either on local systems or by means of batch submission to external servers. In addition, BLAST against protein entries in MvirDB, our database of microbial virulence factors, is performed. A web client browser enables viewing of computational results and downloaded annotations, and a query tool enables structured and free-text search capabilities. When available, links to external databases, including MvirDB, are provided. MannDB contains whole-proteome analyses for at least one representative organism from each category of biological threat organism listed by APHIS, CDC, HHS, NIAID, USDA, USFDA, and WHO. 
CONCLUSION: MannDB comprises a large number of genomes and comprehensive protein sequence analyses representing organisms listed as high-priority agents on the websites of several governmental organizations concerned with bio-terrorism. MannDB provides the user with a BLAST interface for comparison of native and non-native sequences and a query tool for conveniently selecting proteins of interest. In addition, the user has access to a web-based browser that compiles comprehensive and extensive reports. Access to MannDB is freely available at http://manndb.llnl.gov/.

Subject(s)

Bacterial Proteins/chemistry; Bacterial Proteins/metabolism; Databases, Protein; Information Storage and Retrieval/methods; Sequence Alignment/methods; Sequence Analysis, Protein/methods; User-Computer Interface; Algorithms; Amino Acid Sequence; Bacterial Proteins/classification; Bacterial Proteins/genetics; Binding Sites; Computer Graphics; Database Management Systems; Internet; Molecular Sequence Data; Protein Binding; Proteome/chemistry; Proteome/classification; Proteome/genetics; Proteome/metabolism; Software; Systems Integration

Computational approaches for identification of conserved/unique binding pockets in the A chain of ricin.

Zhou, Carol L Ecale; Zemla, Adam T; Roe, Diana; Young, Malin; Lam, Marisa; Schoeniger, Joseph S; Balhorn, Rod.

Bioinformatics ; 21(14): 3089-96, 2005 Jul 15.

Article in English | MEDLINE | ID: mdl-15905278

ABSTRACT

MOTIVATION: Specific and sensitive ligand-based protein detection assays that employ antibodies or small molecules such as peptides or aptamers require that the corresponding surface region of the protein be accessible and that there be minimal cross-reactivity with non-target proteins. To reduce the time and cost of laboratory screening efforts for diagnostic reagents, we developed new methods for evaluating and selecting protein surface regions for ligand targeting. RESULTS: We devised combined structure- and sequence-based methods for identifying 3D epitopes and binding pockets on the surface of the A chain of ricin that are conserved with respect to a set of ricin A chains and unique with respect to other proteins. We (1) used structure alignment software to detect structural deviations and extracted from this analysis the residue-residue correspondence, (2) devised a method to compare corresponding residues across sets of ricin structures and structures of closely related proteins, (3) devised a sequence-based approach to determine residue infrequency in local sequence context and (4) modified a pocket-finding algorithm to identify surface crevices in close proximity to residues determined to be conserved/unique based on our structure- and sequence-based methods. In applying this combined informatics approach to ricin A, we identified a conserved/unique pocket in close proximity to (but not overlapping) the active site that is suitable for bi-dentate ligand development. These methods are generally applicable to identification of surface epitopes and binding pockets for development of diagnostic reagents, therapeutics and vaccines.

Subject(s)

Algorithms; Models, Chemical; Models, Molecular; Ricin/analysis; Ricin/chemistry; Sequence Alignment/methods; Sequence Analysis, Protein/methods; Amino Acid Sequence; Binding Sites; Computer Simulation; Conserved Sequence; Molecular Sequence Data; Protein Binding; Protein Conformation; Sequence Homology, Amino Acid
