ABSTRACT
The initial adoption of penicillin as an antibiotic marked the start of exploring other compounds essential for pharmaceuticals, yet resistance to penicillins and their side effects has compromised their efficacy. The N-terminal nucleophile (Ntn) amide-hydrolases S45 family plays a key role in catalyzing amide bond hydrolysis in various compounds, including antibiotics like penicillin and cephalosporin. This study comprehensively analyzes the structural and functional traits of the bacterial N-terminal nucleophile (Ntn) amide-hydrolases S45 family, covering penicillin G acylases, cephalosporin acylases, and D-succinylase. Utilizing structural bioinformatics tools and sequence analysis, the investigation delineates structurally conserved regions (SCRs) and substrate binding site variations among these enzymes. Notably, sixteen SCRs crucial for substrate interaction are identified solely through sequence analysis, emphasizing the significance of sequence data in characterizing functionally relevant regions. These findings introduce a novel approach for identifying targets to enhance the biocatalytic properties of N-terminal nucleophile (Ntn) amide-hydrolases, while facilitating the development of more accurate three-dimensional models, particularly for enzymes lacking structural data. Overall, this research advances our understanding of structure-function relationships in bacterial N-terminal nucleophile (Ntn) amide-hydrolases, providing insights into strategies for optimizing their enzymatic capabilities.
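The abstract above reports conserved regions found from sequence data alone, but it does not describe the procedure. As a minimal sketch of the general idea only (not the authors' method), the Python below scores each column of a multiple sequence alignment by the fraction of sequences sharing its most common residue, then reports contiguous runs of well-conserved columns. The function names, the threshold, the minimum run length, and the toy alignment are all hypothetical choices made for illustration.

```python
from collections import Counter


def column_conservation(alignment):
    """Return, for each alignment column, the fraction of sequences that
    carry the most common residue. Gaps ('-') never count as a match."""
    n_seqs = len(alignment)
    scores = []
    for i in range(len(alignment[0])):
        counts = Counter(seq[i] for seq in alignment if seq[i] != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / n_seqs)
    return scores


def conserved_regions(scores, threshold=0.8, min_len=3):
    """Return (start, end) index pairs, 0-based and inclusive, for runs of
    at least min_len consecutive columns scoring >= threshold."""
    regions = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                regions.append((start, i - 1))
            start = None
    # Close a run that extends to the last column.
    if start is not None and len(scores) - start >= min_len:
        regions.append((start, len(scores) - 1))
    return regions


# Hypothetical toy alignment (equal-length, pre-aligned sequences).
toy_alignment = [
    "MKSLAGTWQ",
    "MKSLVGTWQ",
    "MKSL-GTYQ",
    "MRSLAGTWE",
]

if __name__ == "__main__":
    scores = column_conservation(toy_alignment)
    print(conserved_regions(scores, threshold=0.75, min_len=2))
```

A real analysis would start from a curated alignment of the family and would likely use a substitution-aware conservation score rather than simple identity; the identity score is used here only because it is easy to verify by hand.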
Subject(s)
Amidohydrolases , Amidohydrolases/chemistry , Amidohydrolases/metabolism , Amidohydrolases/genetics , Bacterial Proteins/chemistry , Bacterial Proteins/metabolism , Bacterial Proteins/genetics , Binding Sites , Structure-Activity Relationship , Conserved Sequence , Bacteria/enzymology , Amino Acid Sequence , Models, Molecular , Substrate Specificity
ABSTRACT
Multivalency in lectins plays a pivotal role in influencing glycan cross-linking, thereby affecting lectin functionality. This multivalency can be achieved through oligomerization, the presence of tandemly repeated carbohydrate recognition domains, or a combination of both. Unlike lectins that rely on multiple factors for the oligomerization of identical monomers, tandem-repeat lectins inherently possess multivalency, independent of this complex process. The repeat domains, although not identical, display slightly distinct specificities within a predetermined geometry, enhancing specificity, affinity, avidity and even oligomerization. Although numerous studies have recognized this structural characteristic in recently discovered lectins, a unified criterion to define tandem-repeat lectins is still needed. We suggest defining them as multivalent lectins with intrachain tandem repeats corresponding to carbohydrate recognition domains, independent of oligomerization. This systematic review examines the folding and phyletic diversity of tandem-repeat lectins and refers to relevant literature. Our study categorizes all lectins with tandemly repeated carbohydrate recognition domains into nine distinct folding classes associated with specific biological functions. Our findings provide a comprehensive description and analysis of tandem-repeat lectins in terms of their functions and structural features. Our exploration of phyletic and functional diversity has revealed previously undocumented tandem-repeat lectins. We propose research directions aimed at enhancing our understanding of the origins of tandem-repeat lectins and fostering the development of medical and biotechnological applications, notably in the design of artificial sugars and neolectins.
Subject(s)
Lectins , Tandem Repeat Sequences , Animals , Humans , Lectins/chemistry , Lectins/metabolism
ABSTRACT
Deep learning methods, trained on the increasing set of available protein 3D structures and sequences, have substantially impacted the protein modeling and design field. These advancements have facilitated the creation of novel proteins, or the optimization of existing ones designed for specific functions, such as binding a target protein. Despite the demonstrated potential of such approaches in designing general protein binders, their application in designing immunotherapeutics remains relatively unexplored. A relevant application is the design of T cell receptors (TCRs). Given the crucial role of T cells in mediating immune responses, redirecting these cells to tumor or infected target cells through the engineering of TCRs has shown promising results in treating diseases, especially cancer. However, the computational design of TCR interactions presents challenges for current physics-based methods, particularly due to the unique natural characteristics of these interfaces, such as low affinity and cross-reactivity. For this reason, in this study, we explored the potential of two structure-based deep learning protein design methods, ProteinMPNN and ESM-IF, in designing fixed-backbone TCRs for binding target antigenic peptides presented by the MHC through different design scenarios. To evaluate TCR designs, we employed a comprehensive set of sequence- and structure-based metrics, highlighting the benefits of these methods in comparison to classical physics-based design methods and identifying deficiencies for improvement.
ABSTRACT
The COVID-19 pandemic evolves constantly, requiring adaptable solutions to combat emerging SARS-CoV-2 variants. To address this, we created a pentameric scaffold based on a mammalian protein, which can be customized with up to 10 protein binding modules. This molecular scaffold spans roughly 20 nm and can simultaneously neutralize SARS-CoV-2 Spike proteins from one or multiple viral particles. Using only two different modules targeting the Spike's RBD, this construct outcompetes human antibodies from vaccinated individuals' serum and blocks in vitro cell attachment and pseudotyped virus entry. Additionally, the multibodies inhibit viral replication at low picomolar concentrations, regardless of the variant. This customizable multibody can be easily produced in prokaryotic systems, providing a new avenue for therapeutic development and detection devices, and contributing to preparedness against rapidly evolving pathogens.
Subject(s)
COVID-19 , SARS-CoV-2 , Animals , Humans , Pandemics , Cell-Matrix Junctions , Mammals
ABSTRACT
Histamine is a biogenic amine found in fish-derived and fermented food products with physiological relevance since its concentration is proportional to food spoilage and health risk for sensitive consumers. There are various analytical methods for histamine quantification from food samples; however, a simple and quick enzymatic detection and quantification method is highly desirable. Histamine dehydrogenase (HDH) is a candidate for enzymatic histamine detection; however, other biogenic amines can change its activity or produce false positive results, and substrate inhibition is observed at higher concentrations. In this work, we studied the effect of site saturation mutagenesis in Rhizobium sp. histamine dehydrogenase (Rsp HDH) at nine amino acid positions selected through structural alignment analysis, substrate docking, and proximity to the proposed histamine-binding site. The resulting libraries were screened for histamine and agmatine activity. Variants from two libraries (positions 72 and 110) showed improved histamine/agmatine activity ratio, decreased substrate inhibition, and maintained thermal resistance. In addition, activity characterization of the identified Phe72Thr and Asn110Val HDH variants showed a clear substrate inhibition curve for histamine and modified kinetic parameters. The observed maximum velocity (Vmax) increased for variant Phe72Thr at the cost of an increased value for the Michaelis-Menten constant (Km) for histamine. The increased Km value, decreased substrate inhibition, and biogenic amine interference observed for variant Phe72Thr support a tradeoff between substrate affinity and substrate inhibition in the catalytic mechanism of HDHs. Considering this tradeoff in future enzyme engineering of HDH could lead to substantial gains in performance and in the understanding of this enzyme class.
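The abstract above mentions Vmax, Km and a substrate inhibition curve without naming the kinetic model that was fitted. As a hedged illustration, the sketch below uses the classical substrate-inhibition form of the Michaelis-Menten equation, v = Vmax·[S] / (Km + [S] + [S]²/Ki), which is an assumption here and not necessarily the model used in the study. The function names and all numeric parameter values are hypothetical.

```python
import math


def substrate_inhibition_rate(s, vmax, km, ki):
    """Reaction rate under the classical substrate-inhibition model:
    v = vmax * s / (km + s + s**2 / ki).
    s, km and ki share one concentration unit; v has the unit of vmax."""
    return vmax * s / (km + s + s * s / ki)


def optimal_substrate(km, ki):
    """Substrate concentration at which the rate peaks, sqrt(km * ki),
    obtained by setting dv/ds = 0 in the model above."""
    return math.sqrt(km * ki)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show the curve's shape.
    vmax, km, ki = 1.0, 1.0, 4.0
    s_peak = optimal_substrate(km, ki)
    for s in (0.5, 1.0, s_peak, 4.0, 16.0):
        v = substrate_inhibition_rate(s, vmax, km, ki)
        print(f"[S] = {s:5.2f}  v = {v:.3f}")
```

In this model a larger Ki means weaker substrate inhibition, and the peak position sqrt(Km·Ki) moves to higher substrate concentrations when either Km or Ki increases, which is one simple way to picture a tradeoff between affinity and inhibition.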
Subject(s)
Agmatine , Rhizobium , Animals , Histamine/metabolism , Substrate Specificity , Rhizobium/metabolism , Agmatine/analysis , Biogenic Amines/analysis , Food Quality , Protein Engineering
ABSTRACT
Proteins are some of the most fascinating and challenging molecules in the universe, and they pose a big challenge for artificial intelligence. The implementation of machine learning/AI in protein science gives rise to a world of knowledge adventures in the workhorse of the cell and proteome homeostasis, which are essential for making life possible. This opens up epistemic horizons thanks to a coupling of human tacit-explicit knowledge with machine learning power, the benefits of which are already tangible, such as important advances in protein structure prediction. Moreover, the driving force behind the protein processes of self-organization, adjustment, and fitness requires a space corresponding to gigabytes of life data in its order of magnitude. There are many tasks such as novel protein design, protein folding pathways, and synthetic metabolic routes, as well as protein-aggregation mechanisms, pathogenesis of protein misfolding and disease, and proteostasis networks that are currently unexplored or unrevealed. In this systematic review and biochemical meta-analysis, we aim to contribute to bridging the gap between artificial intelligence (AI) and protein science (PS), which we call the AI-PS binomial, a growing research enterprise with exciting and promising biotechnological and biomedical applications. We undertake our task by exploring "the state of the art" in AI and machine learning (ML) applications to protein science in the scientific literature to address some critical research questions in this domain, including: What kind of tasks are already explored by ML approaches to protein sciences? What are the most common ML algorithms and databases used? What is the situational diagnosis of the AI-PS inter-field? What do ML processing steps have in common? We also formulate novel questions such as: Is it possible to discover what the rules of protein evolution are with the AI-PS binomial? How do protein folding pathways evolve? What are the rules that dictate the folds?
What are the minimal nuclear protein structures? How do protein aggregates form and why do they exhibit different toxicities? What are the structural properties of amyloid proteins? How can we design an effective proteostasis network to deal with misfolded proteins? We are a cross-functional group of scientists from several academic disciplines, and we have conducted the systematic review using a variant of the PICO and PRISMA approaches. The search was carried out in four databases (PubMed, Bireme, OVID, and EBSCO Web of Science), resulting in 144 research articles. After three rounds of quality screening, 93 articles were finally selected for further analysis. A summary of our findings is as follows: regarding AI applications, there are mainly four types: 1) genomics, 2) protein structure and function, 3) protein design and evolution, and 4) drug design. In terms of the ML algorithms and databases used, supervised learning was the most common approach (85%). As for the databases used for the ML models, PDB and UniprotKB/Swissprot were the most common ones (21 and 8%, respectively). Moreover, we identified that approximately 63% of the articles organized their results into three steps, which we labeled pre-process, process, and post-process. A few studies combined data from several databases or created their own databases after the pre-process. Our main finding is that, as of today, there are no research road maps serving as guides to address gaps in our knowledge of the AI-PS binomial. All research efforts to collect, integrate multidimensional data features, and then analyze and validate them are, so far, uncoordinated and scattered throughout the scientific literature without a clear epistemic goal or connection between the studies. 
Therefore, our main contribution to the scientific literature is to offer a road map to help solve problems in drug design, protein structures, design, and function prediction while also presenting the "state of the art" on research in the AI-PS binomial until February 2021. Thus, we pave the way toward future advances in the synthetic redesign of novel proteins and protein networks and artificial metabolic pathways, learning lessons from nature for the welfare of humankind. Many of these novel proteins and metabolic pathways do not currently exist in nature and are not yet used in the chemical industry or biomedical field.
ABSTRACT
The stabilization of natural proteins is a long-standing goal in protein engineering. Optimizing the hydrophobicity of the protein core often results in extensive stability enhancements. However, the presence of totally or partially buried catalytic charged residues, essential for protein function, has limited the applicability of this strategy. Here, focusing on thioredoxin, we aimed to augment protein stability by removing buried charged residues in the active site without loss of catalytic activity. To this end, we performed a charged-to-hydrophobic substitution of a buried and functional group, resulting in a significant stability increase yet abolishing catalytic activity. Then, to simulate the catalytic role of the buried ionizable group, we designed a combinatorial library of variants targeting a set of seven surface residues adjacent to the active site. Notably, more than 50% of the library variants restored, to some extent, the catalytic activity. The combination of experimental study of 2% of the library with the prediction of the whole mutational space by partial least squares regression revealed that a single point mutation at the protein surface is sufficient to fully restore the catalytic activity without thermostability cost. As a result, we engineered one of the highest thermal stabilities reported for a protein with a naturally occurring fold (137°C). Further, our hyperstable variant preserves the catalytic activity both in vitro and in vivo.
ABSTRACT
Effective use of plant biomass as an abundant and renewable feedstock for biofuel production and biorefinery requires efficient enzymatic mobilization of cell wall polymers. Knowledge of plant cell wall composition and architecture has been exploited to develop novel multifunctional enzymes with improved activity against lignocellulose, where a left-handed β-3-prism synthetic scaffold (BeSS) was designed for insertion of multiple protein domains at the prism vertices. This allowed construction of a series of chimeras fusing variable numbers of a GH11 β-endo-1,4-xylanase and the CipA-CBM3 with defined distances and constrained relative orientations between catalytic domains. The cellulose binding and endoxylanase activities of all chimeras were maintained. Activity against lignocellulose substrates revealed a rapid 1.6- to 3-fold increase in total reducing saccharide release and increased levels of all major oligosaccharides as measured by polysaccharide analysis using carbohydrate gel electrophoresis (PACE). A construct with CBM3 and GH11 domains inserted in the same prism vertex showed the highest activity, demonstrating that interdomain geometry rather than the number of catalytic sites is important for optimized chimera design. These results confirm that the BeSS concept is robust and can be successfully applied to the construction of multifunctional chimeras, which expands the possibilities for knowledge-based protein design.
ABSTRACT
BACKGROUND: Protein-peptide interactions play a fundamental role in a wide variety of biological processes, such as cell signaling, regulatory networks, immune responses, and enzyme inhibition. Peptides are characterized by low toxicity and small interface areas; therefore, they are good targets for therapeutic strategies, rational drug planning and protein inhibition. Approximately 10% of the ethical pharmaceutical market is protein/peptide-based. Furthermore, it is estimated that 40% of protein interactions are mediated by peptides. Despite the fast increase in the volume of biological data, particularly on sequences and structures, there remains a lack of broad and comprehensive protein-peptide databases and tools that allow the retrieval, characterization and understanding of protein-peptide recognition and consequently support peptide design. RESULTS: We introduce Propedia, a comprehensive and up-to-date database with a web interface that permits clustering, searching and visualizing of protein-peptide complexes according to varied criteria. Propedia comprises over 19,000 high-resolution structures from the Protein Data Bank including structural and sequence information from protein-peptide complexes. The main advantage of Propedia over other peptide databases is that it allows a more comprehensive analysis of similarity and redundancy. It was constructed based on a hybrid clustering algorithm that compares and groups peptides by sequences, interface structures and binding sites. Propedia is available through a graphical, user-friendly and functional interface where users can retrieve and analyze complexes and download each search data set. We performed case studies and verified the utility of Propedia scores for ranking promising interacting peptides.
In a study involving predicting peptides to inhibit SARS-CoV-2 main protease, we showed that Propedia scores related to similarity between different peptide complexes with SARS-CoV-2 main protease are in agreement with molecular dynamics free energy calculation. CONCLUSIONS: Propedia is a database and tool to support structure-based rational design of peptides for special purposes. Protein-peptide interaction data can be useful for predicting, classifying and scoring complexes, as well as for designing new molecules. Propedia is up-to-date as a ready-to-use webserver with a friendly and resourceful interface and is available at: https://bioinfo.dcc.ufmg.br/propedia.
Subject(s)
Database Management Systems , Databases, Protein , Peptides/chemistry , Proteins/chemistry , Algorithms , Humans
ABSTRACT
Computational protein design is still a challenge for advancing structure-function relationships. While recent advances in this field are promising, more information for genuine predictions is needed. Here, we discuss different approaches applied to install novel glutamine (Gln) binding into the Lysine/Arginine/Ornithine binding protein (LAOBP) from Salmonella typhimurium. We studied the ligand binding behavior of two mutants: a binding pocket grafting design based on a structural superposition of LAOBP to the Gln binding protein QBP from Escherichia coli and a design based on statistically coupled positions. The latter showed the ability to bind Gln even though the protein was not very stable. Comparison of both approaches highlighted a nonconservative shared point mutation between LAOBP_graft and LAOBP_sca. This context-dependent L117K mutation in LAOBP turned out to be sufficient for introducing Gln binding, as confirmed by different experimental techniques. Moreover, the crystal structure of LAOBP_L117K in complex with its ligand is reported.
Subject(s)
Amino Acids/chemistry , Bacterial Proteins/chemistry , Carrier Proteins/chemistry , Salmonella typhimurium/chemistry , Bacterial Proteins/genetics , Binding Sites , Carrier Proteins/genetics , Ligands , Models, Molecular , Mutation , Protein Conformation , Thermodynamics
ABSTRACT
The electrostatic potential plays a key role in many biological processes, such as determining the affinity of a ligand for a given protein target, and it is responsible for the catalytic activity of many enzymes. Understanding the effect that amino acid mutations have on the electrostatic potential of a protein allows a thorough understanding of which residues are the most important in that protein. MutantElec is a user-friendly web application for in silico site-directed mutagenesis of proteins and for comparing the electrostatic potential of the wild-type protein with that of the mutant(s), based on the three-dimensional structure of the protein. The effect of the mutation is evaluated using an approach different from the traditional surface map. MutantElec provides a graphical display of the results that allows the visualization of changes occurring at close distance from the mutation and thus uncovers the local and global impact of a specific change. © 2017 Wiley Periodicals, Inc.
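The abstract above does not say how the electrostatic potential is computed. Purely as an illustration of comparing wild-type and mutant potentials at a probe point, the sketch below applies Coulomb's law with a uniform dielectric constant, which is a strong simplifying assumption and is not presented as MutantElec's method. The function names, coordinates, charges and dielectric value are hypothetical.

```python
import math

# Coulomb constant in kcal*Angstrom/(mol*e^2), as commonly used in
# molecular mechanics force fields.
COULOMB_KCAL = 332.0637


def coulomb_potential(point, charges, dielectric=80.0):
    """Electrostatic potential (kcal/mol per unit charge) at `point`
    from point charges given as (x, y, z, q) tuples, with coordinates
    in Angstrom and q in elementary charges. Assumes a uniform dielectric."""
    total = 0.0
    for x, y, z, q in charges:
        r = math.dist(point, (x, y, z))
        if r == 0.0:
            raise ValueError("probe point coincides with a charge")
        total += COULOMB_KCAL * q / (dielectric * r)
    return total


def potential_change(point, wild_type, mutant, dielectric=80.0):
    """Mutant potential minus wild-type potential at `point`."""
    return (coulomb_potential(point, mutant, dielectric)
            - coulomb_potential(point, wild_type, dielectric))


if __name__ == "__main__":
    # Hypothetical example: a +1 side-chain charge (e.g. lysine) at the
    # origin is replaced by a neutral residue; a distant -1 charge stays.
    wild_type = [(0.0, 0.0, 0.0, 1.0), (20.0, 0.0, 0.0, -1.0)]
    mutant = [(0.0, 0.0, 0.0, 0.0), (20.0, 0.0, 0.0, -1.0)]
    for distance in (5.0, 10.0, 15.0):
        probe = (0.0, distance, 0.0)
        print(distance, potential_change(probe, wild_type, mutant))
```

Evaluating the difference at probe points placed at increasing distances from the mutated residue gives a rough picture of how local or far-reaching a change is; a production tool would more plausibly use a continuum electrostatics solver with the protein's actual shape and partial charges.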
Subject(s)
Computer Simulation , Mutant Proteins/chemistry , Mutant Proteins/genetics , Mutation , Static Electricity , Amino Acids/chemistry , Amino Acids/genetics , Ligands , Molecular Dynamics Simulation , Mutagenesis, Site-Directed , User-Computer Interface
ABSTRACT
The artificial protein Octarellin V.1 (http://dx.doi.org/10.1016/j.jsb.2016.05.004 [1]) was obtained through a directed evolution process applied to the de novo designed Octarellin V (http://dx.doi.org/10.1016/S0022-2836(02)01206-8 [2]). The protein has been characterized by circular dichroism and fluorescence techniques in order to obtain data on its thermal and chemical stability. Moreover, data on the secondary structure content, studied by circular dichroism and infrared techniques, are reported for Octarellin V and V.1. Two crystallization helpers, nanobodies (http://dx.doi.org/10.1038/nprot.2014.039 [3]) and αRep (http://dx.doi.org/10.1016/j.jmb.2010.09.048 [4]), have been used to create stable complexes. Here we present the data obtained from the binding characterization of Octarellin V.1 with the crystallization helpers by isothermal titration calorimetry.
ABSTRACT
Protein glycosylation is a common post-translational modification, the effect of which on protein conformation and stability is incompletely understood. Here we have investigated the effects of glycosylation on the thermostability of Bacillus subtilis xylanase A (XynA) expressed in Pichia pastoris. Intact mass analysis of the heterologous wild-type XynA revealed two, three, or four Hex(8-16)GlcNAc2 modifications involving asparagine residues at positions 20, 25, 141, and 181. Molecular dynamics (MD) simulations of the XynA modified with various combinations of branched Hex9GlcNAc2 at these positions indicated a significant contribution from protein-glycan interactions to the overall energy of the glycoproteins. The effect of glycan content and glycosylation position on protein stability was evaluated by combinatorial mutagenesis of all six potential N-glycosylation sites. The majority of glycosylated enzymes expressed in P. pastoris presented increased thermostability in comparison with their unglycosylated counterparts expressed in Escherichia coli. Steric effects of multiple glycosylation events were apparent, and glycosylation position rather than the number of glycosylation events determined increases in thermostability. The MD simulations also indicated that clustered glycan chains tended to favor less stabilizing glycan-glycan interactions, whereas more dispersed glycosylation patterns favored stabilizing protein-glycan interactions.