Overlap and diversity in antimicrobial peptide databases: compiling a non-redundant set of sequences.
Aguilera-Mendoza, Longendri; Marrero-Ponce, Yovani; Tellez-Ibarra, Roberto; Llorente-Quesada, Monica T; Salgado, Jesús; Barigye, Stephen J; Liu, Jun.
Affiliation
  • Aguilera-Mendoza L; Grupo de Investigación de Bioinformática, Centro de Estudio de Matemática Computacional, Universidad de las Ciencias Informáticas, La Habana, Cuba.
  • Marrero-Ponce Y; Grupo de Investigación en Estudios Químicos y Biológicos, Facultad de Ciencias Básicas, Universidad Tecnológica de Bolívar, Cartagena de Indias, Bolívar, Colombia, Facultad de Química Farmacéutica, Universidad de Cartagena, Cartagena de Indias, Bolívar, Colombia, Instituto de Ciencia Molecular (ICMo
  • Tellez-Ibarra R; Grupo de Investigación de Bioinformática, Centro de Estudio de Matemática Computacional, Universidad de las Ciencias Informáticas, La Habana, Cuba.
  • Llorente-Quesada MT; Grupo de Investigación de Bioinformática, Centro de Estudio de Matemática Computacional, Universidad de las Ciencias Informáticas, La Habana, Cuba.
  • Salgado J; Instituto de Ciencia Molecular (ICMol), Universitat de València, C/ Catedrático José Beltrán, 2, 46980, Paterna (Valencia), Spain.
  • Barigye SJ; Departamento de Química, Universidade Federal de Lavras, UFLA Caixa Postal 3037, 37200-000 Lavras, MG, Brazil.
  • Liu J; School of Computing and Mathematics, Faculty of Computing and Engineering, Ulster University, Jordanstown campus, Northern Ireland, UK.
Bioinformatics ; 31(15): 2553-9, 2015 Aug 01.
Article in English | MEDLINE | ID: mdl-25819673
MOTIVATION: The large variety of antimicrobial peptide (AMP) databases developed to date are characterized by a substantial overlap of data and similarity of sequences. Our goals are to analyze the levels of redundancy for all available AMP databases and use this information to build a new non-redundant sequence database. For this purpose, a new software tool is introduced.

RESULTS: A comparative study of 25 AMP databases reveals the overlap and diversity among them and the internal diversity within each database. The overlap analysis shows that only one database (Peptaibol) contains exclusive data, not present in any other, whereas all sequences in the LAMP_Patent database are included in CAMP_Patent. However, the majority of databases have their own set of unique sequences, as well as some overlap with other databases. The complete set of non-duplicate sequences comprises 16 990 cases, which is almost half of the total number of reported peptides. On the other hand, the diversity analysis identifies the most and least diverse databases and proves that all databases exhibit some level of redundancy. Finally, we present a new parallel-free software, named Dover Analyzer, developed to compute the overlap and diversity between any number of databases and compile a set of non-redundant sequences. These results are useful for selecting or building a suitable representative set of AMPs, according to specific needs.
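The overlap analysis summarized in the abstract can be illustrated with a minimal sketch. This is not the Dover Analyzer implementation: it assumes, for illustration only, that two entries overlap when their sequences are identical after whitespace removal and upper-casing, and it does not attempt the similarity-based diversity analysis. The function names (`normalize`, `overlap_report`), the database names, and the toy sequences are all hypothetical.

```python
from itertools import combinations


def normalize(seq: str) -> str:
    """Assumed normalization: drop whitespace and upper-case the residues."""
    return "".join(seq.split()).upper()


def overlap_report(databases: dict[str, list[str]]):
    """Exact-match overlap between named sequence collections.

    Returns the pairwise shared counts, the sequences exclusive to each
    collection, and the pooled set of non-duplicate sequences.
    """
    sets = {name: {normalize(s) for s in seqs} for name, seqs in databases.items()}

    # Number of sequences shared by each pair of collections.
    pairwise = {
        (a, b): len(sets[a] & sets[b]) for a, b in combinations(sorted(sets), 2)
    }

    # Sequences found in one collection and in no other.
    exclusive = {}
    for name, seqs in sets.items():
        others = set().union(*(s for n, s in sets.items() if n != name))
        exclusive[name] = seqs - others

    # Pooled collection with exact duplicates removed.
    non_duplicate = set().union(*sets.values())
    return pairwise, exclusive, non_duplicate


# Toy input (hypothetical names and sequences): db_c is fully contained in db_b.
toy = {
    "db_a": ["GIGKFLHSAKKF", "KWKLFKKIEK"],
    "db_b": ["kwklfkkiek", "FLPLIGRVLSGIL"],
    "db_c": ["FLPLIGRVLSGIL"],
}
pairwise, exclusive, non_duplicate = overlap_report(toy)
```

In this toy input `db_c` has no exclusive sequences because everything in it also appears in `db_b`, which mirrors the kind of full containment the abstract reports for LAMP_Patent within CAMP_Patent.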

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Software / Sequence Analysis, Protein / Antimicrobial Cationic Peptides / Databases, Nucleic Acid / Databases, Protein Limits: Humans Language: En Journal: Bioinformatics Journal subject: MEDICAL INFORMATICS Year of publication: 2015 Document type: Article Country of affiliation: Cuba Country of publication: United Kingdom
