ABSTRACT
Protein secondary structures are important in many biological processes and applications. Thanks to advances in sequencing methods, many proteins have been sequenced, but far fewer have secondary structures determined by laboratory methods. With the development of computer technology, computational methods have started to become the most important methodologies for predicting secondary structures. Motivated by the recent results obtained by computational methods in this task, we evaluated two different approaches to this problem: (i) template-free classifiers, based on machine learning techniques; and (ii) template-based classifiers, based on searching tools. Each approach is an ensemble of sub-classifiers (six for template-free and two for template-based), each with a specific view of the protein. Our results show that these ensembles improve on the results of each approach individually.
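As an illustration of how several sub-classifiers might be combined, the following minimal Python sketch applies per-residue majority voting to 3-state predictions. The abstract does not state the combination rule actually used, so majority voting, the function name `majority_vote`, and the H/E/C labels are assumptions made here for illustration only.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine equal-length per-residue label strings by majority vote.

    Hypothetical helper: the abstract does not specify how the ensembles
    merge their sub-classifiers. Ties go to the label seen first, i.e.
    the one from the earliest-listed sub-classifier.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")

    consensus = []
    for labels in zip(*predictions):
        # labels holds one residue's label from every sub-classifier
        consensus.append(Counter(labels).most_common(1)[0][0])
    return "".join(consensus)


# Three hypothetical sub-classifier outputs for a four-residue protein
print(majority_vote(["HHEC", "HEEC", "HHCC"]))
```

Weighted voting or a trained meta-classifier would be equally plausible choices; simple voting is shown only because it needs no training data.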
Subject(s)
Computational Biology/methods , Protein Secondary Structure , Proteins/chemistry , Algorithms , Protein Databases , Machine Learning , Neural Networks (Computer) , Protein Conformation , Software
ABSTRACT
Improving the accuracy of protein secondary structure prediction has been an important task in bioinformatics, since it is not only the starting point for obtaining tertiary structure in hierarchical modeling but also enhances sequence analysis and sequence-structure threading to help determine structure and function. Herein we present a model based on the DSPRED classifier, a hybrid method composed of dynamic Bayesian networks and a support vector machine, to predict 3-state secondary structure information of proteins. We used the SCOPe (Structural Classification of Proteins-extended) database to train and test the model. The results show that DSPRED reached a Q3 accuracy of 82.36% when trained and tested using proteins from all SCOPe classes. We compared our method with the popular PSIPRED on the SCOPe test datasets and found that our method outperformed PSIPRED.
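For readers unfamiliar with the metric, Q3 accuracy is the percentage of residues whose predicted 3-state label matches the observed one. The sketch below shows that calculation in Python; the function name `q3_accuracy` and the H/E/C (helix, strand, coil) string encoding are assumptions for illustration, not part of DSPRED.

```python
def q3_accuracy(predicted, observed):
    """Return the percentage of residues with a correct 3-state label.

    Both arguments are strings over an assumed H/E/C alphabet, with one
    character per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    # Count positions where the predicted label equals the observed label
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Toy example: 5 of 6 residues agree
print(round(q3_accuracy("HHHEEC", "HHHECC"), 2))
```

When a whole test set is scored, Q3 can be computed per protein and averaged, or pooled over all residues; the abstract does not say which convention was used.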
Subject(s)
Protein Secondary Structure , Support Vector Machine , Artificial Intelligence , Computational Biology/methods
ABSTRACT
Thioredoxins (Trx) are ubiquitous proteins that regulate several biochemical processes inside the cell. Trx is an important player, displaying oxidoreductase activity and helping to maintain and regulate the oxidative state of the cellular environment. Trx also participates in the regulation of many cellular functions, such as DNA synthesis, protection against oxidative stress, the cell cycle and signal transduction. Oxidized Trx is the target of another set of proteins, such as thioredoxin reductase (TrR), which uses the reductive potential of NADPH. The oxidized state of Trx also plays an important role in regulating the redox state of the cell. In this regard, the oxidized form of Trx is a putative conformer that contributes to the cellular redox environment. Here we report the chemical shift assignments (1H, 13C and 15N) in solution at 15 °C. We also present the secondary structure analysis of the oxidized form of yeast thioredoxin (yTrx1) as a basis for future NMR studies of protein-target interactions and dynamics. The assignment was done at low concentration (200 µM) because it is important to keep the water cavity intact.
Subject(s)
Biomolecular Nuclear Magnetic Resonance , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae , Thioredoxins/chemistry , Oxidation-Reduction , Solutions
ABSTRACT
The Colletotrichum gloeosporioides species complex is among the most destructive fungal plant pathogens in the world; however, identification of member species of quarantine importance is hampered by a number of factors. Structural information from the rRNA marker may be considered conserved and can be used as supplementary information for species identification. The difficulty in using ITS rDNA sequences for identification lies in the low level of sequence variation at the intra-specific level and in artificially induced sequence variation caused by errors in polymerization of the ITS array during DNA replication. Type and query ITS sequences were subjected to sequence analyses prior to generation of predicted consensus secondary structures; these analyses covered the pattern of nucleotide polymorphisms, the number of indel haplotypes, GC content, and detection of artificially induced sequence variation. Structure stability was then assessed, conserved motifs in the secondary structures were identified, and all sequences were mapped onto the consensus C. gloeosporioides sensu stricto secondary structure for the ITS1, 5.8S and ITS2 markers. Motifs that are evolutionarily conserved among eukaryotes were found for all ITS1, 5.8S and ITS2 sequences. The sequences exhibited conserved features typical of functional rRNAs. Generally, polymorphisms occurred within less conserved regions and appeared as bulges, internal and terminal loops, or non-canonical G-U base pairs within the double-stranded helices. Importantly, there were also taxonomic motifs and base changes that were unique to specific taxa and that may be used to support intra-specific identification of members of the C. gloeosporioides sensu lato species complex.
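Two of the quantities mentioned above, GC content and the distinction between canonical and G-U base pairs, can be sketched in a few lines of Python. The function names `gc_content` and `classify_pair` are hypothetical, and this is not the pipeline used in the study; it assumes ungapped sequences and treats T as U when classifying pairs.

```python
# Unordered base pairs in RNA notation (T is mapped to U before lookup)
WATSON_CRICK = {frozenset("AU"), frozenset("GC")}
WOBBLE = {frozenset("GU")}


def gc_content(sequence):
    """Return the percentage of G and C bases in an ungapped sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def classify_pair(base_a, base_b):
    """Label a pair as 'canonical', 'G-U wobble', or 'mismatch'."""
    pair = frozenset(
        base.upper().replace("T", "U") for base in (base_a, base_b)
    )
    if pair in WATSON_CRICK:
        return "canonical"
    if pair in WOBBLE:
        return "G-U wobble"
    return "mismatch"


print(round(gc_content("GGCCAT"), 2))
print(classify_pair("G", "U"))
```

Aligned sequences would need gap characters removed before calling `gc_content`, since every character counts toward the length.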