Search | VHL Regional Portal (BVS)

1.

The application of chemical similarity measures in an unconventional modeling framework c-RASAR along with dimensionality reduction techniques to a representative hepatotoxicity dataset.

Banerjee, Arkaprava; Roy, Kunal.

Sci Rep ; 14(1): 20812, 2024 Sep 06.

Article in English | MEDLINE | ID: mdl-39242880

ABSTRACT

With the exponential progress in the field of cheminformatics, the conventional modeling approaches have so far been to employ supervised and unsupervised machine learning (ML) and deep learning models, utilizing the standard molecular descriptors, which represent the structural, physicochemical, and electronic properties of a particular compound. Deviating from the conventional approach, in this investigation, we have employed the classification Read-Across Structure-Activity Relationship (c-RASAR), which involves the amalgamation of the concepts of classification-based quantitative structure-activity relationship (QSAR) and Read-Across to incorporate Read-Across-derived similarity and error-based descriptors into a statistical and machine learning modeling framework. ML models developed from these RASAR descriptors use similarity-based information from the close source neighbors of a particular query compound. We have employed different classification modeling algorithms on the selected QSAR and RASAR descriptors to develop predictive models for efficient prediction of query compounds' hepatotoxicity. The predictivity of each of these models was evaluated on a large number of test set compounds. The best-performing model was also used to screen a true external data set. The concepts of explainable AI (XAI) coupled with Read-Across were used to interpret the contributions of the RASAR descriptors in the best c-RASAR model and to explain the chemical diversity in the dataset. The application of various unsupervised dimensionality reduction techniques like t-SNE and UMAP and the supervised ARKA framework showed the usefulness of the RASAR descriptors over the selected QSAR descriptors in their ability to group similar compounds, enhancing the modelability of the dataset and efficiently identifying activity cliffs. 
Furthermore, the activity cliffs were also identified from Read-Across by observing the nature of the compounds constituting the nearest neighbors of a particular query compound. On comparing our simple linear c-RASAR model with the previously reported models developed using the same dataset derived from the US FDA Orange Book (https://www.accessdata.fda.gov/scripts/cder/ob/index.cfm), it was observed that our model is simple, reproducible, transferable, and highly predictive. The performance of the LDA c-RASAR model on the true external set surpasses that of the previously reported work. Therefore, the present simple LDA c-RASAR model can efficiently be used to predict the hepatotoxicity of query chemicals.
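
The idea of deriving similarity-based descriptors from the close source neighbors of a query compound can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: fingerprints are represented as sets of "on" bit indices, similarity is the Tanimoto coefficient, and the descriptor names (`max_sim`, `mean_sim`, `weighted_pos_fraction`) are hypothetical stand-ins, not the descriptor set used in the paper.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def read_across_descriptors(
    query: Set[int],
    sources: List[Tuple[Set[int], int]],  # (fingerprint bits, class label 0/1)
    k: int = 3,
) -> Dict[str, float]:
    """Summarise the k most similar source compounds of a query compound.

    Illustrative descriptors only (hypothetical names, not the paper's set).
    """
    if not sources:
        raise ValueError("at least one source compound is required")
    # Rank source compounds by similarity to the query and keep the top k.
    ranked = sorted(
        ((tanimoto(query, fp), label) for fp, label in sources),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    sims = [s for s, _ in ranked]
    total = sum(sims)
    # Similarity-weighted fraction of positive (toxic) close neighbors.
    weighted_pos = sum(s for s, y in ranked if y == 1) / total if total > 0 else 0.0
    return {
        "max_sim": sims[0],
        "mean_sim": total / len(sims),
        "weighted_pos_fraction": weighted_pos,
    }


# Toy data: three source compounds with made-up bit sets and labels.
toy_sources = [({1, 2, 3}, 1), ({1, 2, 4}, 0), ({7, 8}, 0)]
toy_descriptors = read_across_descriptors({1, 2, 3}, toy_sources, k=2)
```

Descriptors of this kind could then be fed to any classifier (for example a linear discriminant model) in place of, or alongside, conventional molecular descriptors.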

Subject(s)

Chemical and Drug Induced Liver Injury , Quantitative Structure-Activity Relationship , Chemical and Drug Induced Liver Injury/etiology , Algorithms , Machine Learning , Humans , Cheminformatics/methods

2.

Cheminformatics analysis of indoleamine and tryptophan 2,3-dioxygenase inhibitors: A descriptor and fingerprint based machine learning approach to disclose selectivity measures.

Irannejad, Hamid; Valipour, Mehdi.

Comput Biol Med ; 180: 108954, 2024 Sep.

Article in English | MEDLINE | ID: mdl-39094327

ABSTRACT

Indoleamine 2,3-dioxygenase (IDO) and tryptophan 2,3-dioxygenase (TDO) are attractive drug targets for cancer immunotherapy. After the disappointing results of epacadostat, a selective IDO inhibitor, in phase III clinical trials, there is much interest in the development of TDO-selective inhibitors. In the current study, several data analysis methods and machine learning approaches, including logistic regression, Random Forest, XGBoost and Support Vector Machines, were used to model a data set of compounds retrieved from ChEMBL. Models based on Morgan fingerprints revealed notable fragments for the selective inhibition of IDO, TDO or both. Multiple fragment docking was performed to find the best set of bound fragments and their orientation in space for efficient linking. Linking the fragments and optimization of the final molecules were accomplished by means of an artificial intelligence generative framework. Finally, the selectivity of the optimized molecules was assessed and the top 4 lead molecules were filtered through PAINS, Brenk and NIH filters. Results indicated that phenyloxalamide, fluoroquinoline, and 3-bromo-4-fluoroaniline confer selectivity towards IDO inhibition. Correspondingly, 1-benzyl-1H-naphtho[2,3-d][1,2,3]triazole-4,9-dione was found to be an integral fragment for the selective inhibition of TDO by forming a coordination bond with the Fe atom of heme. In addition, furo[2,3-c]pyridine-2,3-diamine was found to be a common fragment for inhibition of both targets and can be used in the design of dual-target inhibitors of IDO and TDO. The new fragments introduced here can be useful building blocks for incorporation into selective TDO or dual IDO/TDO inhibitors.

Subject(s)

Cheminformatics , Enzyme Inhibitors , Indoleamine-Pyrrole 2,3,-Dioxygenase , Machine Learning , Tryptophan Oxygenase , Indoleamine-Pyrrole 2,3,-Dioxygenase/antagonists & inhibitors , Indoleamine-Pyrrole 2,3,-Dioxygenase/chemistry , Indoleamine-Pyrrole 2,3,-Dioxygenase/metabolism , Tryptophan Oxygenase/antagonists & inhibitors , Tryptophan Oxygenase/metabolism , Tryptophan Oxygenase/chemistry , Humans , Cheminformatics/methods , Enzyme Inhibitors/chemistry , Molecular Docking Simulation

3.

Cheminformatics-Guided Exploration of Synthetic Marine Natural Product-Inspired Brominated Indole-3-Glyoxylamides and Their Potentials for Drug Discovery.

Holland, Darren C; Prebble, Dale W; Calcott, Mark J; Schroder, Wayne A; Ferretti, Francesca; Lock, Aaron; Avery, Vicky M; Kiefel, Milton J; Carroll, Anthony R.

Molecules ; 29(15), 2024 Aug 01.

Article in English | MEDLINE | ID: mdl-39125052

ABSTRACT

Marine natural products (MNPs) continue to be tested primarily in cellular toxicity assays, both mammalian and microbial, despite most being inactive at concentrations relevant to drug discovery. These MNPs become missed opportunities and represent a wasteful use of precious bioresources. The use of cheminformatics aligned with published bioactivity data can provide insights to direct the choice of bioassays for the evaluation of new MNPs. Cheminformatics analysis of MNPs found in MarinLit (n = 39,730) up to the end of 2023 highlighted indol-3-yl-glyoxylamides (IGAs, n = 24) as a group of MNPs with no reported bioactivities. However, a recent review of synthetic IGAs highlighted these scaffolds as privileged structures with several compounds under clinical evaluation. Herein, we report the synthesis of a library of 32 MNP-inspired brominated IGAs (25-56) using a simple one-pot, multistep method affording access to these diverse chemical scaffolds. Directed by a meta-analysis of the biological activities reported for marine indole alkaloids (MIAs) and synthetic IGAs, the brominated IGAs 25-56 were examined for their potential bioactivities against the Parkinson's Disease amyloid protein alpha synuclein (α-syn), antiplasmodial activities against chloroquine-resistant (3D7) and sensitive (Dd2) parasite strains of Plasmodium falciparum, and inhibition of mammalian (chymotrypsin and elastase) and viral (SARS-CoV-2 3CLpro) proteases. All of the synthetic IGAs tested exhibited binding affinity to the amyloid protein α-syn, while some showed inhibitory activities against P. falciparum, and the proteases, SARS-CoV-2 3CLpro, and chymotrypsin. The cellular safety of the IGAs was examined against cancerous and non-cancerous human cell lines, with all of the compounds tested inactive, thereby validating cheminformatics and meta-analyses results. 
The findings presented herein expand our knowledge of marine IGA bioactive chemical space and advocate expanding the scope of biological assays routinely used to investigate NP bioactivities, specifically those more suitable for non-toxic compounds. By integrating cheminformatics tools and functional assays into NP biological testing workflows, we can aim to enhance the potential of NPs and their scaffolds for future drug discovery and development.

Subject(s)

Biological Products , Cheminformatics , Drug Discovery , Biological Products/chemistry , Biological Products/pharmacology , Humans , Cheminformatics/methods , SARS-CoV-2/drug effects , Aquatic Organisms/chemistry , Indoles/chemistry , Indoles/pharmacology , Plasmodium falciparum/drug effects , Indole Alkaloids/pharmacology , Indole Alkaloids/chemistry , Animals

4.

Cheminformatics-based analysis identified (Z)-2-(2,5-dimethoxy benzylidene)-6-(2-(4-methoxyphenyl)-2-oxoethoxy) benzofuran-3(2H)-one as an inhibitor of Marburg replication by interacting with NP.

Siddiquee, Noimul Hasan; Talukder, Md Enamul Kabir; Ahmed, Ezaz; Zeba, Labiba Tasnim; Aivy, Farjana Sultana; Rahman, Md Hasibur; Barua, Durjoy; Rumman, Rahnumazzaman; Hossain, Md Ifteker; Shimul, Md Ebrahim Khalil; Rama, Anika Rahman; Chowdhury, Sristi; Hossain, Imam.

Microb Pathog ; 195: 106892, 2024 Oct.

Article in English | MEDLINE | ID: mdl-39216611

ABSTRACT

The highly pathogenic Marburg virus (MARV), a non-segmented negative-strand RNA virus, is a member of the Filoviridae family. This article presents a computer-aided drug design (CADD) approach for identifying drug-like compounds that prevent MARV disease by inhibiting the nucleoprotein (NP), which is responsible for viral replication. This study used a wide range of in silico drug design techniques to identify potential drugs. Out of 368 natural compounds, 202 passed ADMET screening, and molecular docking identified the top two molecules (CID: 1804018 and 5280520), with binding affinities of -6.77 and -6.672 kcal/mol, respectively. Both compounds interacted with the common amino acid residues SER_216, ARG_215, TYR_135, CYS_195, and ILE_108, which indicates that the lead compounds and control ligands bind in the same active/catalytic site of the protein. The binding free energies of CID: 1804018 and 5280520 were -66.01 and -31.29 kcal/mol, respectively. The two lead compounds were re-evaluated using MD simulations, which confirmed CID: 1804018 as the most stable when complexed with the target protein. PC3 of (Z)-2-(2,5-dimethoxybenzylidene)-6-(2-(4-methoxyphenyl)-2-oxoethoxy) benzofuran-3(2H)-one (CID: 1804018) was 8.74 %, whereas PC3 of 2'-Hydroxydaidzein (CID: 5280520) was 11.25 %. In this study, (Z)-2-(2,5-dimethoxybenzylidene)-6-(2-(4-methoxyphenyl)-2-oxoethoxy) benzofuran-3(2H)-one (CID: 1804018), which is found in Angelica archangelica, performed well across the ADMET, molecular docking, MM-GBSA and MD simulation analyses, showing significant stability in the protein's binding site and a strongly negative binding free energy. This makes it the best drug candidate identified here, one that may potentially inhibit MARV replication by targeting the nucleoprotein.

Subject(s)

Antiviral Agents , Benzofurans , Marburgvirus , Molecular Docking Simulation , Virus Replication , Antiviral Agents/pharmacology , Antiviral Agents/chemistry , Antiviral Agents/metabolism , Marburgvirus/drug effects , Marburgvirus/metabolism , Benzofurans/pharmacology , Benzofurans/chemistry , Benzofurans/metabolism , Virus Replication/drug effects , Cheminformatics/methods , Drug Design , Protein Binding , RNA-Binding Proteins/metabolism , RNA-Binding Proteins/chemistry , Binding Sites , Ligands

5.

Neglected Tropical Diseases: A Chemoinformatics Approach for the Use of Biodiversity in Anti-Trypanosomatid Drug Discovery.

Valli, Marilia; Döring, Thiago H; Marx, Edgard; Ferreira, Leonardo L G; Medina-Franco, José L; Andricopulo, Adriano D.

Biomolecules ; 14(8), 2024 Aug 20.

Article in English | MEDLINE | ID: mdl-39199420

ABSTRACT

The development of new treatments for neglected tropical diseases (NTDs) remains a major challenge in the 21st century. In most cases, the available drugs are obsolete and have limitations in terms of efficacy and safety. The situation becomes even more complex when considering the low number of new chemical entities (NCEs) currently in use in advanced clinical trials for most of these diseases. Natural products (NPs) are valuable sources of hits and lead compounds with privileged scaffolds for the discovery of new bioactive molecules. Considering the relevance of biodiversity for drug discovery, a chemoinformatics analysis was conducted on a compound dataset of NPs with anti-trypanosomatid activity reported in 497 research articles from 2019 to 2024. Structures corresponding to different metabolic classes were identified, including terpenoids, benzoic acids, benzenoids, steroids, alkaloids, phenylpropanoids, peptides, flavonoids, polyketides, lignans, cytochalasins, and naphthoquinones. This unique collection of NPs occupies regions of the chemical space with drug-like properties that are relevant to anti-trypanosomatid drug discovery. The gathered information greatly enhanced our understanding of biologically relevant chemical classes, structural features, and physicochemical properties. These results can be useful in guiding future medicinal chemistry efforts for the development of NP-inspired NCEs to treat NTDs caused by trypanosomatid parasites.

Subject(s)

Biodiversity , Biological Products , Cheminformatics , Drug Discovery , Neglected Diseases , Animals , Humans , Biological Products/chemistry , Biological Products/pharmacology , Biological Products/therapeutic use , Cheminformatics/methods , Drug Discovery/methods , Neglected Diseases/drug therapy , Trypanocidal Agents/chemistry , Trypanocidal Agents/pharmacology , Trypanocidal Agents/therapeutic use , Trypanosoma/drug effects

6.

Navigating pharmacophore space to identify activity discontinuities: A case study with BCR-ABL.

Lejmi, Maroua; Geslin, Damien; Bureau, Ronan; Cuissart, Bertrand; Ben Slima, Ilef; Meddouri, Nida; Borgi, Amel; Lamotte, Jean-Luc; Lepailleur, Alban.

Mol Inform ; 43(8): e202400050, 2024 Aug.

Article in English | MEDLINE | ID: mdl-38979846

ABSTRACT

The exploration of chemical space is a fundamental aspect of chemoinformatics, particularly when one explores a large compound data set to relate chemical structures with molecular properties. In this study, we extend our previous work on chemical space visualization at the pharmacophoric level. Instead of using conventional binary classification of affinity (active vs inactive), we introduce a refined approach that categorizes compounds into four distinct classes based on their activity levels: super active, very active, active, and inactive. This classification enriches the color scheme applied to pharmacophore space, where the color representation of a pharmacophore hypothesis is driven by the associated compounds. Using the BCR-ABL tyrosine kinase as a case study, we identified intriguing regions corresponding to pharmacophore activity discontinuities, providing valuable insights for structure-activity relationships analysis.
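
The four-class scheme, and the way a pharmacophore's color can be driven by its associated compounds, can be sketched in Python as follows. The abstract does not state the activity cut-offs or the coloring rule, so the pIC50 thresholds below and the "most frequent class wins" rule are purely hypothetical assumptions chosen for illustration.

```python
from bisect import bisect_right
from collections import Counter
from typing import Iterable

# Hypothetical pIC50 cut-offs; the abstract does not give the real ones.
THRESHOLDS = [6.0, 8.0, 9.0]
LABELS = ["inactive", "active", "very active", "super active"]


def activity_class(pic50: float) -> str:
    """Map a pIC50 value to one of four activity classes."""
    # bisect_right counts how many thresholds are <= pic50.
    return LABELS[bisect_right(THRESHOLDS, pic50)]


def dominant_class(pic50_values: Iterable[float]) -> str:
    """Assumed coloring rule: the most frequent class among the compounds
    that match a pharmacophore hypothesis."""
    counts = Counter(activity_class(v) for v in pic50_values)
    return counts.most_common(1)[0][0]


example_color_class = dominant_class([9.2, 9.5, 7.0])
```

Under these assumptions, a pharmacophore whose matching compounds mostly fall in one class would be colored by that class, and neighboring pharmacophores with different dominant classes would mark candidate activity discontinuities.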

Subject(s)

Fusion Proteins, bcr-abl , Protein Kinase Inhibitors , Fusion Proteins, bcr-abl/antagonists & inhibitors , Fusion Proteins, bcr-abl/chemistry , Protein Kinase Inhibitors/chemistry , Protein Kinase Inhibitors/pharmacology , Structure-Activity Relationship , Humans , Cheminformatics/methods , Pharmacophore

7.

Versatile Deep Learning Pipeline for Transferable Chemical Data Extraction.

Alshehri, Abdulelah S; Horstmann, Kai A; You, Fengqi.

J Chem Inf Model ; 64(15): 5888-5899, 2024 Aug 12.

Article in English | MEDLINE | ID: mdl-39009039

ABSTRACT

Chemical information disseminated in scientific documents offers an untapped potential for deep learning-assisted insights and breakthroughs. Automated extraction efforts have shifted from resource-intensive manual extraction toward applying machine learning methods to streamline chemical data extraction. While current extraction models and pipelines have ushered in notable efficiency improvements, they often exhibit modest performance, compromising the accuracy of predictive models trained on extracted data. Further, current chemical pipelines lack both transferability (where a model trained on one task can be adapted to another relevant task with limited examples) and extensibility, which enables seamless adaptability for new extraction tasks. Addressing these gaps, we present ChemREL, a versatile chemical data extraction pipeline emphasizing performance, transferability, and extensibility. ChemREL utilizes a custom, diverse data set of chemical documents, labeled through an active learning strategy to extract two properties: normal melting point and lethal dose 50 (LD50). The normal melting point is selected for its prevalence in diverse contexts and wider literature, serving as the foundation for pipeline training. In contrast, LD50 evaluates the pipeline's transferability to an unrelated property, underscoring variance in its biological nature, toxicological context, and units, among other differences. With pretraining and fine-tuning, our pipeline outperforms existing methods and GPT-4, achieving F1-scores of 96.1% for entity identification and 97.0% for relation mapping, culminating in an overall F1-score of 95.4%. More importantly, ChemREL displays high transferability, effectively transitioning from melting point extraction to LD50 extraction with 10 randomly selected training documents.
Released as an open-source package, ChemREL aims to broaden access to chemical data extraction, enabling the construction of expansive relational data sets that propel discovery.
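
For readers unfamiliar with the F1-score reported above, the following Python sketch shows how it is conventionally computed from true-positive, false-positive and false-negative counts. The counts used in the example are made up, and this is the textbook definition, not ChemREL's evaluation code.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from extraction counts.

    tp: correctly extracted items, fp: spurious extractions,
    fn: items present in the document but missed.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Made-up counts: 8 correct extractions, 2 spurious, 2 missed.
example_scores = precision_recall_f1(tp=8, fp=2, fn=2)
```

Because F1 is a harmonic mean, it is pulled toward the lower of precision and recall, which is why it is commonly preferred over plain accuracy for extraction tasks.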

Subject(s)

Deep Learning , Data Mining/methods , Cheminformatics/methods

8.

How to correctly develop q-RASAR models for predictive cheminformatics.

Banerjee, Arkaprava; Roy, Kunal.

Expert Opin Drug Discov ; 19(9): 1017-1022, 2024 Sep.

Article in English | MEDLINE | ID: mdl-38966910

Subject(s)

Cheminformatics , Humans , Cheminformatics/methods , Drug Discovery/methods , Drug Design , Quantitative Structure-Activity Relationship , Drug Development/methods

9.

Polypharmacology prediction: the long road toward comprehensively anticipating small-molecule selectivity to de-risk drug discovery.

Manen-Freixa, Leticia; Antolin, Albert A.

Expert Opin Drug Discov ; 19(9): 1043-1069, 2024 Sep.

Article in English | MEDLINE | ID: mdl-39004919

ABSTRACT

INTRODUCTION: Small molecules often bind to multiple targets, a behavior termed polypharmacology. Anticipating polypharmacology is essential for drug discovery since unknown off-targets can modulate safety and efficacy, profoundly affecting drug discovery success. Unfortunately, experimental methods to assess selectivity present significant limitations and drugs still fail in the clinic due to unanticipated off-targets. Computational methods are a cost-effective, complementary approach to predict polypharmacology. AREAS COVERED: This review aims to provide a comprehensive overview of the state of polypharmacology prediction and discuss its strengths and limitations, covering both classical cheminformatics methods and bioinformatic approaches. The authors review available data sources, paying close attention to their different coverage. The authors then discuss major algorithms grouped by the types of data that they exploit using selected examples. EXPERT OPINION: Polypharmacology prediction has made impressive progress over the last decades and contributed to the identification of many off-targets. However, data incompleteness currently prevents most approaches from comprehensively predicting selectivity. Moreover, our limited agreement on model assessment makes it difficult to identify the best algorithms, which at present show modest performance in prospective real-world applications. Despite these limitations, the exponential increase of multidisciplinary Big Data and AI holds much potential to improve polypharmacology prediction and de-risk drug discovery.

Subject(s)

Algorithms , Computational Biology , Drug Discovery , Polypharmacology , Humans , Drug Discovery/methods , Computational Biology/methods , Cheminformatics/methods , Animals

10.

Stereoisomers Are Not Machine Learning's Best Friends.

Tahil, Gökhan; Delorme, Fabien; Le Berre, Daniel; Monflier, Éric; Sayede, Adlane; Tilloy, Sébastien.

J Chem Inf Model ; 64(14): 5451-5469, 2024 Jul 22.

Article in English | MEDLINE | ID: mdl-38949069

ABSTRACT

This study addresses the challenge of accurately identifying stereoisomers in cheminformatics, which originates from our objective to apply machine learning to predict the association constant between cyclodextrin and a guest. Identifying stereoisomers is indeed crucial for machine learning applications. Current tools offer various molecular descriptors, including their textual representation as Isomeric SMILES, which can distinguish stereoisomers. However, such a representation is text-based and does not have a fixed size, so a conversion is needed to make it usable by machine learning approaches. Word embedding techniques can be used to solve this problem. Mol2vec, a word embedding approach for molecules, offers such a conversion. Unfortunately, it cannot distinguish between stereoisomers due to its inability to capture the spatial configuration of molecular structures. This study proposes several approaches that use word embedding techniques to handle molecular discrimination using stereochemical information on molecules or considering Isomeric SMILES notation as a text in Natural Language Processing. Our aim is to generate a distinct vector for each unique molecule, correctly identifying stereoisomer information in cheminformatics. The proposed approaches are then compared on our original machine learning task: predicting the association constant between cyclodextrin and a guest molecule.
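
Treating Isomeric SMILES as text means splitting each string into tokens before any embedding is learned. The Python sketch below is an assumption-laden illustration, not the authors' method: it uses a simplified regular-expression tokenizer and a hypothetical `strip_stereo` helper to show that two enantiomers yield different token sequences only while the stereo markers (`@`, `@@`, `/`, `\`) are kept.

```python
import re
from typing import List

# Simplified SMILES tokenizer: bracket atoms, two-letter halogens,
# two-digit ring closures, then any remaining single character.
_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles: str) -> List[str]:
    """Split a SMILES string into tokens, keeping stereo markers."""
    return _TOKEN.findall(smiles)


def strip_stereo(smiles: str) -> str:
    """Hypothetical helper: drop chirality and double-bond stereo markers."""
    return re.sub(r"[@/\\]", "", smiles)


# The two enantiomers of alanine written as Isomeric SMILES.
alanine_a = "N[C@@H](C)C(=O)O"
alanine_b = "N[C@H](C)C(=O)O"

tokens_differ = tokenize(alanine_a) != tokenize(alanine_b)
tokens_collapse = tokenize(strip_stereo(alanine_a)) == tokenize(strip_stereo(alanine_b))
```

With the markers kept, the chiral center appears as `[C@@H]` in one sequence and `[C@H]` in the other, so a text-based embedding can in principle assign the two molecules distinct vectors; once the markers are stripped, the sequences are identical.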

Subject(s)

Machine Learning , Stereoisomerism , Cheminformatics/methods , Cyclodextrins/chemistry , Natural Language Processing

11.

uafR: An R package that automates mass spectrometry data processing.

Stratton, Chase A; Thompson, Yvonne; Zio, Konilo; Morrison, William R; Murrell, Ebony G.

PLoS One ; 19(7): e0306202, 2024.

Article in English | MEDLINE | ID: mdl-38968199

ABSTRACT

Chemical information has become increasingly ubiquitous and has outstripped the pace of analysis and interpretation. We have developed an R package, uafR, that automates a grueling retrieval process for gas chromatography coupled mass spectrometry (GC-MS) data and allows anyone interested in chemical comparisons to quickly perform advanced structural similarity matches. Our streamlined cheminformatics workflows allow anyone with basic experience in R to pull out component areas for tentative compound identifications using the best published understanding of molecules across samples (pubchem.gov). Interpretations can now be done at a fraction of the time, cost, and effort it would typically take using a standard chemical ecology data analysis pipeline. The package was tested in two experimental contexts: (1) a dataset of purified internal standards, which showed that our algorithms correctly identified the known compounds with R² values ranging from 0.827 to 0.999 across concentrations ranging from 1 × 10⁻⁵ to 1 × 10³ ng/µl; (2) a large, previously published dataset, where the number and types of compounds identified were comparable (or identical) to those identified with the traditional manual peak annotation process, and NMDS analysis of the compounds produced the same pattern of significance as in the original study. Both the speed and accuracy of GC-MS data processing are drastically improved with uafR because it allows users to fluidly interact with their experiment following tentative library identifications [i.e. after the m/z spectra have been matched against an installed chemical fragmentation database (e.g. NIST)]. Use of uafR will allow larger datasets to be collected and systematically interpreted quickly. Furthermore, the functions of uafR could allow backlogs of previously collected and annotated data to be processed by new personnel or students as they are being trained.
This is critical as we enter the era of exposomics, metabolomics, volatilomes, and landscape-level, high-throughput chemotyping. This package was developed to advance collective understanding of chemical data and is applicable to any research that benefits from GC-MS analysis. It can be downloaded for free along with sample datasets from GitHub at github.org/castratton/uafR or installed directly from R or RStudio using the developer tools: 'devtools::install_github("castratton/uafR")'.

Subject(s)

Algorithms , Gas Chromatography-Mass Spectrometry , Software , Gas Chromatography-Mass Spectrometry/methods , Cheminformatics/methods

12.

OpenChemIE: An Information Extraction Toolkit for Chemistry Literature.

Fan, Vincent; Qian, Yujie; Wang, Alex; Wang, Amber; Coley, Connor W; Barzilay, Regina.

J Chem Inf Model ; 64(14): 5521-5534, 2024 Jul 22.

Article in English | MEDLINE | ID: mdl-38950894

ABSTRACT

Information extraction from chemistry literature is vital for constructing up-to-date reaction databases for data-driven chemistry. Complete extraction requires combining information across text, tables, and figures, whereas prior work has mainly investigated extracting reactions from single modalities. In this paper, we present OpenChemIE to address this complex challenge and enable the extraction of reaction data at the document level. OpenChemIE approaches the problem in two steps: extracting relevant information from individual modalities and then integrating the results to obtain a final list of reactions. For the first step, we employ specialized neural models that each address a specific task for chemistry information extraction, such as parsing molecules or reactions from text or figures. We then integrate the information from these modules using chemistry-informed algorithms, allowing for the extraction of fine-grained reaction data from reaction condition and substrate scope investigations. Our machine learning models attain state-of-the-art performance when evaluated individually, and we meticulously annotate a challenging dataset of reaction schemes with R-groups to evaluate our pipeline as a whole, achieving an F1 score of 69.5%. Additionally, the reaction extraction results of OpenChemIE attain an accuracy score of 64.3% when directly compared against the Reaxys chemical database. OpenChemIE is most suited for information extraction on organic chemistry literature, where molecules are generally depicted as planar graphs or written in text and can be consolidated into a SMILES format. We provide OpenChemIE freely to the public as an open-source package, as well as through a web interface.

Subject(s)

Machine Learning , Data Mining/methods , Databases, Chemical , Algorithms , Cheminformatics/methods

13.

Chemoinformatics Insights on Molecular Jackhammers and Cancer Cells.

Ayala-Orozco, Ciceron; Teimouri, Hamid; Medvedeva, Angela; Li, Bowen; Lathem, Alex; Li, Gang; Kolomeisky, Anatoly B; Tour, James M.

J Chem Inf Model ; 64(14): 5570-5579, 2024 Jul 22.

Article in English | MEDLINE | ID: mdl-38958581

ABSTRACT

One of the most challenging tasks in modern medicine is to find novel efficient cancer therapeutic methods with minimal side effects. The recent discovery of several classes of organic molecules known as "molecular jackhammers" is a promising development in this direction. It is known that these molecules can directly target and eliminate cancer cells with no impact on healthy tissues. However, the underlying microscopic picture remains poorly understood. We present a study that utilizes theoretical analysis together with experimental measurements to clarify the microscopic aspects of jackhammers' anticancer activities. Our physical-chemical approach combines statistical analysis with chemoinformatics methods to design and optimize molecular jackhammers. By correlating specific physical-chemical properties of these molecules with their abilities to kill cancer cells, several important structural features are identified and discussed. Although our theoretical analysis enhances understanding of the molecular interactions of jackhammers, it also highlights the need for further research to comprehensively elucidate their mechanisms and to develop a robust physical-chemical framework for the rational design of targeted anticancer drugs.

Subject(s)

Antineoplastic Agents , Cheminformatics , Humans , Antineoplastic Agents/pharmacology , Antineoplastic Agents/chemistry , Cheminformatics/methods , Neoplasms/drug therapy , Neoplasms/pathology , Cell Line, Tumor , Models, Molecular

14.

Can large language models understand molecules?

Sadeghi, Shaghayegh; Bui, Alan; Forooghi, Ali; Lu, Jianguo; Ngom, Alioune.

BMC Bioinformatics ; 25(1): 225, 2024 Jun 26.

Article in English | MEDLINE | ID: mdl-38926641

ABSTRACT

PURPOSE: Large Language Models (LLMs) like Generative Pre-trained Transformer (GPT) from OpenAI and LLaMA (Large Language Model Meta AI) from Meta AI are increasingly recognized for their potential in the field of cheminformatics, particularly in understanding Simplified Molecular Input Line Entry System (SMILES), a standard method for representing chemical structures. These LLMs also have the ability to decode SMILES strings into vector representations. METHOD: We investigate the performance of GPT and LLaMA compared to pre-trained models on SMILES in embedding SMILES strings on downstream tasks, focusing on two key applications: molecular property prediction and drug-drug interaction prediction. RESULTS: We find that SMILES embeddings generated using LLaMA outperform those from GPT in both molecular property and DDI prediction tasks. Notably, LLaMA-based SMILES embeddings show results comparable to pre-trained models on SMILES in molecular prediction tasks and outperform the pre-trained models for the DDI prediction tasks. CONCLUSION: The performance of LLMs in generating SMILES embeddings shows great potential for further investigation of these models for molecular embedding. We hope our study bridges the gap between LLMs and molecular embedding, motivating additional research into the potential of LLMs in the molecular representation field. GitHub: https://github.com/sshaghayeghs/LLaMA-VS-GPT .

Subject(s)

Cheminformatics , Cheminformatics/methods , Drug Interactions , Molecular Structure

15.

Synthesis and Cheminformatics-Directed Antibacterial Evaluation of Echinosulfonic Acid-Inspired Bis-Indole Alkaloids.

Holland, Darren C; Hayton, Joshua B; Kiefel, Milton J; Carroll, Anthony R.

Molecules ; 29(12), 2024 Jun 12.

Article in English | MEDLINE | ID: mdl-38930871

ABSTRACT

Synthetic efforts toward complex natural product (NP) scaffolds are useful ones, particularly those aimed at expanding their bioactive chemical space. Here, we utilised an orthogonal cheminformatics-based approach to predict the potential biological activities for a series of synthetic bis-indole alkaloids inspired by elusive sponge-derived NPs, echinosulfone A (1) and echinosulfonic acids A-D (2-5). Our work includes the first synthesis of desulfato-echinosulfonic acid C, an α-hydroxy bis(3'-indolyl) alkaloid (17), and its full NMR characterisation. This synthesis provides corroborating evidence for the structure revision of echinosulfonic acids A-C. Additionally, we demonstrate a robust synthetic strategy toward a diverse range of α-methine bis(3'-indolyl) acids and acetates (11-16) without the need for silica-based purification in either one or two steps. By integrating our synthetic library of bis-indoles with bioactivity data for 2048 marine indole alkaloids (reported up to the end of 2021), we analyzed their overlap with marine natural product chemical diversity. Notably, the C-6 dibrominated α-hydroxy bis(3'-indolyl) and α-methine bis(3'-indolyl) analogues (11, 14, and 17) were found to contain significant overlap with antibacterial C-6 dibrominated marine bis-indoles, guiding our biological evaluation. Validating the results of our cheminformatics analyses, the dibrominated α-methine bis(3'-indolyl) alkaloids (11, 12, 14, and 15) were found to exhibit antibacterial activities against methicillin-sensitive and -resistant Staphylococcus aureus. Further, while investigating other synthetic approaches toward bis-indole alkaloids, 16 incorrectly assigned synthetic α-hydroxy bis(3'-indolyl) alkaloids were identified. After careful analysis of their reported NMR data, and comparison with those obtained for the synthetic bis-indoles reported herein, all of the structures have been revised to α-methine bis(3'-indolyl) alkaloids.

Subject(s)

Anti-Bacterial Agents, Cheminformatics, Indole Alkaloids, Anti-Bacterial Agents/pharmacology, Anti-Bacterial Agents/chemistry, Anti-Bacterial Agents/chemical synthesis, Indole Alkaloids/chemistry, Indole Alkaloids/pharmacology, Indole Alkaloids/chemical synthesis, Cheminformatics/methods, Microbial Sensitivity Tests, Molecular Structure, Structure-Activity Relationship, Biological Products/chemistry, Biological Products/pharmacology, Biological Products/chemical synthesis

16.

Exploration of Type III effector Xanthomonas outer protein Q (XopQ) inhibitor from Picrasma quassioides as an antibacterial agent using chemoinformatics analysis.

Revanasiddappa, Prasanna D; Gowtham, H G; G S, Chikkanna; Gangadhar, Suchithra; A, Satish; Murali, M; Shivamallu, Chandan; Achar, Raghu Ram; Silina, Ekaterina; Stupin, Victor; Manturova, Natalia; Shati, Ali A; Alfaifi, Mohammad Y; Elbehairi, Serag Eldin I; Kollur, Shiva Prasad; Amruthesh, Kestur Nagaraj.

PLoS One ; 19(6): e0302105, 2024.

Article in English | MEDLINE | ID: mdl-38889115

ABSTRACT

The present study was focused on exploring the efficient inhibitors of closed state (form) of type III effector Xanthomonas outer protein Q (XopQ) (PDB: 4P5F) from the 44 phytochemicals of Picrasma quassioides using cutting-edge computational analysis. Among them, Kumudine B showed excellent binding energy (-11.0 kcal/mol), followed by Picrasamide A, Quassidine I and Quassidine J with the targeted closed state of XopQ protein compared to the reference standard drug (Streptomycin). The molecular dynamics (MD) simulations performed at 300 ns validated the stability of top lead ligands (Kumudine B, Picrasamide A, and Quassidine I)-bound XopQ protein complex with slightly lower fluctuation than Streptomycin. The MM-PBSA calculation confirmed the strong interactions of top lead ligands (Kumudine B and Quassidine I) with XopQ protein, as they offered the least binding energy. The results of absorption, distribution, metabolism, excretion, and toxicity (ADMET) analysis confirmed that Quassidine I, Kumudine B and Picrasamide A were found to qualify most of the drug-likeness rules with excellent bioavailability scores compared to Streptomycin. Results of the computational studies suggested that Kumudine B, Picrasamide A, and Quassidine I could be considered potential compounds to design novel antibacterial drugs against X. oryzae infection. Further in vitro and in vivo antibacterial activities of Kumudine B, Picrasamide A, and Quassidine I are required to confirm their therapeutic potentiality in controlling the X. oryzae infection.

Subject(s)

Anti-Bacterial Agents, Molecular Dynamics Simulation, Xanthomonas, Anti-Bacterial Agents/pharmacology, Anti-Bacterial Agents/chemistry, Xanthomonas/drug effects, Cheminformatics/methods, Molecular Docking Simulation, Bacterial Proteins/antagonists & inhibitors, Bacterial Proteins/metabolism, Bacterial Proteins/chemistry

17.

Application of Transformers in Cheminformatics.

Luong, Kha-Dinh; Singh, Ambuj.

J Chem Inf Model ; 64(11): 4392-4409, 2024 Jun 10.

Article in English | MEDLINE | ID: mdl-38815246

ABSTRACT

By accelerating time-consuming processes with high efficiency, computing has become an essential part of many modern chemical pipelines. Machine learning is a class of computing methods that can discover patterns within chemical data and utilize this knowledge for a wide variety of downstream tasks, such as property prediction or substance generation. The complex and diverse chemical space requires complex machine learning architectures with great learning power. Recently, learning models based on transformer architectures have revolutionized multiple domains of machine learning, including natural language processing and computer vision. Naturally, there have been ongoing endeavors in adopting these techniques to the chemical domain, resulting in a surge of publications within a short period. The diversity of chemical structures, use cases, and learning models necessitate a comprehensive summarization of existing works. In this paper, we review recent innovations in adapting transformers to solve learning problems in chemistry. Because chemical data is diverse and complex, we structure our discussion based on chemical representations. Specifically, we highlight the strengths and weaknesses of each representation, the current progress of adapting transformer architectures, and future directions.

Subject(s)

Cheminformatics, Machine Learning, Cheminformatics/methods

18.

Chemoinformatic regression methods and their applicability domain.

Dutschmann, Thomas-Martin; Schlenker, Valerie; Baumann, Knut.

Mol Inform ; 43(7): e202400018, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38803302

ABSTRACT

The growing interest in chemoinformatic model uncertainty calls for a summary of the most widely used regression techniques and how to estimate their reliability. Regression models learn a mapping from the space of explanatory variables to the space of continuous output values. Among other limitations, the predictive performance of the model is restricted by the training data used for model fitting. Identification of unusual objects by outlier detection methods can improve model performance. Additionally, proper model evaluation necessitates defining the limitations of the model, often called the applicability domain. Comparable to certain classifiers, some regression techniques come with built-in methods or augmentations to quantify their (un)certainty, while others rely on generic procedures. The theoretical background of their working principles and how to deduce specific and general definitions for their domain of applicability shall be explained.

Subject(s)

Cheminformatics, Cheminformatics/methods, Regression Analysis

19.

Cheminformatics approach to identify andrographolide derivatives as dual inhibitors of methyltransferases (nsp14 and nsp16) of SARS-CoV-2.

Thomas, Jobin; Ghosh, Anupam; Ranjan, Shivendu; Satija, Jitendra.

Sci Rep ; 14(1): 9801, 2024 04 29.

Article in English | MEDLINE | ID: mdl-38684706

ABSTRACT

The COVID-19 pandemic outbreak has accelerated tremendous efforts to discover a therapeutic strategy that targets severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) to control viral infection. Various viral proteins have been identified as potential drug targets; however, to date, no specific therapeutic cure is available against SARS-CoV-2. To address this issue, the present work reports a systematic cheminformatic approach to identify the potent andrographolide derivatives that can target methyltransferases of SARS-CoV-2, i.e. nsp14 and nsp16, which are crucial for the replication of the virus and host immune evasion. A consensus of cheminformatics methodologies including virtual screening, molecular docking, ADMET profiling, molecular dynamics simulations, free-energy landscape analysis, molecular mechanics generalized Born surface area (MM-GBSA), and density functional theory (DFT) was utilized. Our study reveals two new andrographolide derivatives (PubChem CID: 2734589 and 138968421) as natural bioactive molecules that can form stable complexes with both proteins via hydrophobic interactions, hydrogen bonds and electrostatic interactions. The toxicity analysis predicts class four toxicity for both compounds with LD50 value in the range of 500-700 mg/kg. MD simulation reveals the stable formation of the complex for both the compounds and their average trajectory values were found to be lower than the control inhibitor and protein alone. MM-GBSA analysis corroborates the MD simulation result and showed the lowest energy for the compounds 2734589 and 138968421. The DFT and MEP analysis also predicts the better reactivity and stability of both the hit compounds. Overall, both andrographolide derivatives exhibit good potential as potent inhibitors for both nsp14 and nsp16 proteins; however, in vitro and in vivo assessment would be required to prove their efficacy and safety in clinical settings. Moreover, the drug discovery strategy aiming at the dual target approach might serve as a useful model for inventing novel drug molecules for various other diseases.

Subject(s)

Antiviral Agents, Diterpenes, Methyltransferases, Molecular Docking Simulation, Molecular Dynamics Simulation, SARS-CoV-2, Viral Nonstructural Proteins, Diterpenes/pharmacology, Diterpenes/chemistry, SARS-CoV-2/drug effects, SARS-CoV-2/enzymology, Methyltransferases/antagonists & inhibitors, Methyltransferases/chemistry, Methyltransferases/metabolism, Antiviral Agents/pharmacology, Antiviral Agents/chemistry, Humans, Viral Nonstructural Proteins/antagonists & inhibitors, Viral Nonstructural Proteins/chemistry, Viral Nonstructural Proteins/metabolism, Cheminformatics/methods, COVID-19/virology, Enzyme Inhibitors/chemistry, Enzyme Inhibitors/pharmacology, COVID-19 Drug Treatment

20.

Enhancing the Small-Scale Screenable Biological Space beyond Known Chemogenomics Libraries with Gray Chemical Matter: Compounds with Novel Mechanisms from High-Throughput Screening Profiles.

Thomas, Jason R; Shelton, Claude; Murphy, Jason; Brittain, Scott; Bray, Mark-Anthony; Aspesi, Peter; Concannon, John; King, Frederick J; Ihry, Robert J; Ho, Daniel J; Henault, Martin; Hadjikyriacou, Andrea; Neri, Marilisa; Sigoillot, Frederic D; Pham, Helen T; Shum, Matthew; Barys, Louise; Jones, Michael D; Martin, Eric J; Blechschmidt, Anke; Rieffel, Sébastien; Troxler, Thomas J; Mapa, Felipa A; Jenkins, Jeremy L; Jain, Rishi K; Kutchukian, Peter S; Schirle, Markus; Renner, Steffen.

ACS Chem Biol ; 19(4): 938-952, 2024 04 19.

Article in English | MEDLINE | ID: mdl-38565185

ABSTRACT

Phenotypic assays have become an established approach to drug discovery. Greater disease relevance is often achieved through cellular models with increased complexity and more detailed readouts, such as gene expression or advanced imaging. However, the intricate nature and cost of these assays impose limitations on their screening capacity, often restricting screens to well-characterized small compound sets such as chemogenomics libraries. Here, we outline a cheminformatics approach to identify a small set of compounds with likely novel mechanisms of action (MoAs), expanding the MoA search space for throughput-limited phenotypic assays. Our approach is based on mining existing large-scale, phenotypic high-throughput screening (HTS) data. It enables the identification of chemotypes that exhibit selectivity across multiple cell-based assays, which are characterized by persistent and broad structure-activity relationships (SAR). We validate the effectiveness of our approach in broad cellular profiling assays (Cell Painting, DRUG-seq, and Promotor Signature Profiling) and chemical proteomics experiments. These experiments revealed that the compounds behave similarly to known chemogenetic libraries, but with a notable bias toward novel protein targets. To foster collaboration and advance research in this area, we have curated a public set of such compounds based on the PubChem BioAssay dataset and made it available for use by the scientific community.

Subject(s)

Drug Discovery, High-Throughput Screening Assays, Small Molecule Libraries, Drug Discovery/methods, High-Throughput Screening Assays/methods, Cheminformatics/methods, Small Molecule Libraries/chemistry, Structure-Activity Relationship
