Results 1 - 20 of 74
1.
Mol Inform ; : e202400154, 2024 Aug 06.
Article in English | MEDLINE | ID: mdl-39105614

ABSTRACT

During the early stages of drug design, identifying compounds with suitable bioactivities is crucial. Given the vast array of potential drug databases, it's feasible to assay only a limited subset of candidates. The optimal method for selecting the candidates, aiming to minimize the overall number of assays, involves an active learning (AL) approach. In this work, we benchmarked a range of AL strategies with two main objectives: (1) to identify a strategy that ensures high model performance and (2) to select molecules with desired properties using minimal assays. To evaluate the different AL strategies, we employed the simulated AL workflow based on "virtual" experiments. These experiments leveraged ChEMBL datasets, which come with known biological activity values for the molecules. Furthermore, for classification tasks, we proposed the hybrid selection strategy that unified both exploration and exploitation AL strategies into a single acquisition function, defined by parameters n and c. We have also shown that popular minimal margin and maximal variance selection approaches for exploration selection correspond to minimization of the hybrid acquisition function with n=1 and 2 respectively. The balance between the exploration and exploitation strategies can be adjusted using a coefficient (c), making the optimal strategy selection straightforward. The primary strength of the hybrid selection method lies in its adaptability; it offers the flexibility to adjust the criteria for molecule selection based on the specific task by modifying the value of the contribution coefficient. Our analysis revealed that, in regression tasks, AL strategies didn't succeed at ensuring high model performance, however, they were successful in selecting molecules with desired properties using minimal number of tests. 
In analogous experiments on classification tasks, the exploration strategy and the hybrid selection function with a constant c<1 (for n=1) and c≤0.2 (for n=2) were effective in achieving the goal of constructing a high-performance predictive model using minimal data. When searching for molecules with desired properties, exploitation and the hybrid function with c≥1 (n=1) and c≥0.7 (n=2) identified such molecules in fewer iterations than the random selection method. Notably, when the hybrid function was set to an intermediate coefficient value (c=0.7), it successfully addressed both tasks simultaneously.
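The abstract does not give the hybrid acquisition function itself, so the following is only a minimal sketch of one form that is consistent with its description. It assumes a binary classifier that outputs a probability p of the desired class, an exploration term |2p - 1|^n (minimizing it with n=1 picks the smallest margin, and with n=2 the largest Bernoulli variance p(1 - p)), and an exploitation term -p weighted by c. The function names and the exact way the two terms are combined are assumptions, not the authors' formula.

```python
def hybrid_score(p, n=1, c=0.0):
    """Assumed hybrid acquisition score; lower means 'select first'.

    |2p - 1| ** n is the exploration term (uncertainty near p = 0.5),
    -c * p is the exploitation term (preference for high predicted p).
    """
    return abs(2.0 * p - 1.0) ** n - c * p


def select_batch(probabilities, k, n=1, c=0.0):
    """Return the indices of the k candidates with the lowest hybrid score."""
    order = sorted(range(len(probabilities)),
                   key=lambda i: hybrid_score(probabilities[i], n, c))
    return order[:k]


# Hypothetical predicted probabilities for four unlabeled molecules.
probabilities = [0.10, 0.48, 0.90, 0.70]
most_uncertain = select_batch(probabilities, k=1, n=1, c=0.0)
most_promising = select_batch(probabilities, k=1, n=1, c=5.0)
```

In this sketch c=0 reduces to pure exploration and a large c approaches pure exploitation; the threshold values of c quoted in the abstract apply to the authors' own function and need not transfer to this one.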

2.
J Cheminform ; 16(1): 30, 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38481269

ABSTRACT

Membrane permeability is an in vitro parameter that represents the apparent permeability (Papp) of a compound, and is a key absorption, distribution, metabolism, and excretion parameter in drug development. Although the Caco-2 cell lines are the most used cell lines to measure Papp, other cell lines, such as the Madin-Darby Canine Kidney (MDCK), LLC-Pig Kidney 1 (LLC-PK1), and Ralph Russ Canine Kidney (RRCK) cell lines, can also be used to estimate Papp. Therefore, constructing in silico models for Papp estimation using the MDCK, LLC-PK1, and RRCK cell lines requires collecting extensive amounts of in vitro Papp data. An open database offers extensive measurements of various compounds covering a vast chemical space; however, concerns were reported on the use of data published in open databases without the appropriate accuracy and quality checks. Ensuring the quality of datasets for training in silico models is critical because artificial intelligence (AI, including deep learning) was used to develop models to predict various pharmacokinetic properties, and data quality affects the performance of these models. Hence, careful curation of the collected data is imperative. Herein, we developed a new workflow that supports automatic curation of Papp data measured in the MDCK, LLC-PK1, and RRCK cell lines collected from ChEMBL using KNIME. The workflow consisted of four main phases. Data were extracted from ChEMBL and filtered to identify the target protocols. A total of 1661 high-quality entries were retained after checking 436 articles. The workflow is freely available, can be updated, and has high reusability. Our study provides a novel approach for data quality analysis and accelerates the development of helpful in silico models for effective drug discovery. Scientific Contribution: The cost of building highly accurate predictive models can be significantly reduced by automating the collection of reliable measurement data. 
Our tool reduces the time and effort required for data collection and will enable researchers to focus on constructing high-performance in silico models for other types of analysis. To the best of our knowledge, no such tool is available in the literature.

3.
J Biotechnol Biomed ; 6(4): 501-513, 2023.
Article in English | MEDLINE | ID: mdl-38050632

ABSTRACT

Receptor for Advanced Glycation End products (RAGE) is a transmembrane receptor that can bind to various endogenous and exogenous ligands and initiate the inflammatory downstream signaling pathways. So far RAGE has been involved in various disorders including cardiovascular and neurodegenerative diseases, cancer, and diabetes. Blocking the interactions between RAGE and its ligands is a therapeutic approach to treat these conditions. In this context, we effectively utilized the receptor-based-pharmacophore modeling to discover structurally diverse molecular compounds having potential to effectively bind with RAGE. Two pharmacophore models were developed on V-domain of RAGE using Phase application of Schrodinger suite. The developed pharmacophoric features were used for screening of 1.8 million drug-like molecules downloaded from ChEMBL database. The molecules were scrutinized according to their molecular weight as well as clogP values. Phase screening was performed to find out the molecules that matched the developed pharmacophoric features that were further selected to analyze their binding modes using high-throughput virtual screening, extra precision docking studies and MM-GBSA ΔG binding calculations. These analyses provided ten hit RAGE inhibitory molecules that can bind to two different shallow binding sites on the V-domain of RAGE. Among the obtained compounds two compounds ChEMBL501494 and ChEMBL4081874 were found with best binding free energies that proved their receptor-ligand complex stability within their respective binding cavity on RAGE. Therefore, these molecules could be utilized for further designing and optimizing the future class of potential RAGE inhibitors.

4.
J Cheminform ; 15(1): 82, 2023 Sep 19.
Article in English | MEDLINE | ID: mdl-37726809

ABSTRACT

We report the major highlights of the School of Cheminformatics in Latin America, Mexico City, November 24-25, 2022. Six lectures, one workshop, and one roundtable with four editors were presented during an online public event with speakers from academia, big pharma, and public research institutions. One thousand one hundred eighty-one students and academics from seventy-nine countries registered for the meeting. As part of the meeting, advances in enumeration and visualization of chemical space, applications in natural product-based drug discovery, drug discovery for neglected diseases, toxicity prediction, and general guidelines for data analysis were discussed. Experts from ChEMBL presented a workshop on how to use the resources of this major compounds database used in cheminformatics. The school also included a round table with editors of cheminformatics journals. The full program of the meeting and the recordings of the sessions are publicly available at https://www.youtube.com/@SchoolChemInfLA/featured .

5.
Methods Mol Biol ; 2706: 25-50, 2023.
Article in English | MEDLINE | ID: mdl-37558939

ABSTRACT

Public repositories containing compound-bioactivity data for millions of small molecules offer a valuable resource for chemogenomic compound candidate search. Nonetheless, owing to nonuniform data mining, these databases are often incomplete, which argues for the combined use of data from several repositories to increase target coverage and data accuracy. Here, we present a workflow to generate custom datasets from public databases for mining chemogenomic compound candidates. The compiled set provides flags for differences in structural and bioactivity data and enables rapid extraction of potent and selective bioactive compounds.


Subject(s)
Data Accuracy ; Data Mining ; Databases, Factual
6.
Pharmaceutics ; 15(5)2023 Apr 28.
Article in English | MEDLINE | ID: mdl-37242601

ABSTRACT

Schistosomiasis is one of the most important neglected tropical diseases. Until an effective vaccine is registered for use, the cornerstone of schistosomiasis control remains chemotherapy with praziquantel. The sustainability of this strategy is at substantial risk due to the possibility of praziquantel insensitive/resistant schistosomes developing. Considerable time and effort could be saved in the schistosome drug discovery pipeline if available functional genomics, bioinformatics, cheminformatics and phenotypic resources are systematically leveraged. Our approach, described here, outlines how schistosome-specific resources/methodologies, coupled to the open-access drug discovery database ChEMBL, can be cooperatively used to accelerate early-stage, schistosome drug discovery efforts. Our process identified seven compounds (fimepinostat, trichostatin A, NVP-BEP800, luminespib, epoxomicin, CGP60474 and staurosporine) with ex vivo anti-schistosomula potencies in the sub-micromolar range. Three of those compounds (epoxomicin, CGP60474 and staurosporine) also demonstrated potent and fast-acting ex vivo effects on adult schistosomes and completely inhibited egg production. ChEMBL toxicity data were also leveraged to provide further support for progressing CGP60474 (as well as luminespib and TAE684) as a novel anti-schistosomal compound. As very few compounds are currently at the advanced stages of the anti-schistosomal pipeline, our approaches highlight a strategy by which new chemical matter can be identified and quickly progressed through preclinical development.

7.
J Pers Med ; 13(4)2023 Apr 12.
Article in English | MEDLINE | ID: mdl-37109046

ABSTRACT

Diabetes is a chronic hyperglycemic disorder that leads to a group of metabolic diseases. This condition of chronic hyperglycemia is caused by abnormal insulin levels. The impact of hyperglycemia on the human vascular tree is the leading cause of disease and death in type 1 and type 2 diabetes. People with type 2 diabetes mellitus (T2DM) have abnormal secretion as well as the action of insulin. Type 2 (non-insulin-dependent) diabetes is caused by a combination of genetic factors associated with decreased insulin production, insulin resistance, and environmental conditions. These conditions include overeating, lack of exercise, obesity, and aging. Glucose transport limits the rate of dietary glucose used by fat and muscle. The glucose transporter GLUT4 is kept intracellular and sorted dynamically, and GLUT4 translocation or insulin-regulated vesicular traffic distributes it to the plasma membrane. Different chemical compounds have antidiabetic properties. The complexity, metabolism, digestion, and interaction of these chemical compounds make it difficult to understand and apply them to reduce chronic inflammation and thus prevent chronic disease. In this study, we have applied a virtual screening approach to screen the most suitable and drug-able chemical compounds to be used as potential drug targets against T2DM. We have found that out of 5000 chemical compounds that we have analyzed, only two are known to be more effective as per our experiments based upon molecular docking studies and virtual screening through Lipinski's rule and ADMET properties.

8.
Biomedicines ; 11(1)2023 Jan 10.
Article in English | MEDLINE | ID: mdl-36672680

ABSTRACT

Small molecules are being used to inhibit cyclin dependent kinase (CDK) enzymes in cancer treatment. There is evidence that CDK is a drug-target for cancer therapy across many tumor types because it catalyzes the transfer of the terminal phosphate of ATP to a protein that acts as a substrate. Herein, the identification of pyranopyrazoles that were CDK inhibitors was attempted, whose synthesis was catalyzed by nano-zirconium dioxide via multicomponent reaction. Additionally, we performed an in-situ analysis of the intermediates of multicomponent reactions, for the first-time, which revealed that nano-zirconium dioxide stimulated the reaction, as estimated by Gibbs free energy calculations of spontaneity. Functionally, the novel pyranopyrazoles were tested for a loss of cell viability using human breast cancer cells (MCF-7). It was observed that compounds 5b and 5f effectively produced loss of viability of MCF-7 cells with IC50 values of 17.83 and 23.79 µM, respectively. In vitro and in silico mode-of-action studies showed that pyranopyrazoles target CDK1 in human breast cancer cells, with lead compounds 5b and 5f having potent IC50 values of 960 nM and 7.16 µM, respectively. Hence, the newly synthesized bioactive pyranopyrazoles could serve as better structures to develop CDK1 inhibitors against human breast cancer cells.

9.
Molecules ; 28(2)2023 Jan 04.
Article in English | MEDLINE | ID: mdl-36677547

ABSTRACT

Currently, G protein-coupled receptors (GPCRs) constitute a significant group of membrane-bound receptors representing more than 30% of therapeutic targets. Fluorine is commonly used in designing highly active biological compounds, as evidenced by the steadily increasing number of drugs approved by the Food and Drug Administration (FDA). Herein, we identified and analyzed 898 target-based F-containing isomeric analog sets for SAR analysis (FiSAR sets) in the ChEMBL database, active against 33 different aminergic GPCRs and comprising a total of 2163 fluorinated (1201 unique) compounds. We found that 30 FiSAR sets contain activity cliffs (ACs), defined as pairs of structurally similar compounds showing significant differences in affinity (≥50-fold change), where a change of fluorine position may lead to up to a 1300-fold change in potency. The analysis of matched molecular pair (MMP) networks indicated that the fluorination of aromatic rings showed no clear trend toward a positive or negative effect on affinity. Additionally, we propose an in silico workflow (including induced-fit docking, molecular dynamics, quantum polarized ligand docking, and binding free energy calculations based on the Generalized-Born Surface-Area (GBSA) model) to score the fluorine positions in the molecule.


Subject(s)
Fluorine ; Molecular Dynamics Simulation ; Fluorine/chemistry ; Protein Binding ; Receptors, G-Protein-Coupled/chemistry ; Isomerism ; Ligands ; Molecular Docking Simulation
10.
Mol Inform ; 42(4): e2200208, 2023 04.
Article in English | MEDLINE | ID: mdl-36604304

ABSTRACT

In order to analyze the Chimiothèque Nationale (CN), the French National Compound Library, in the context of screening and biologically relevant compounds, the library was compared with the ZINC in-stock collection and ChEMBL. This includes the study of chemical space coverage, physicochemical properties and Bemis-Murcko (BM) scaffold populations. More than 5 K CN-unique scaffolds (relative to ZINC and ChEMBL collections) were identified. Generative Topographic Maps (GTMs) accommodating those libraries were generated and used to compare the compound populations. Hierarchical GTM ("zooming") was applied to generate an ensemble of maps at various resolution levels, from global overview to precise mapping of individual structures. The respective maps were added to the ChemSpace Atlas website. The analysis of synthetic accessibility in the context of combinatorial chemistry showed that only 29.7% of CN compounds can be fully synthesized using commercially available building blocks.


Subject(s)
Databases, Chemical
11.
Mol Pharm ; 19(7): 2151-2163, 2022 07 04.
Article in English | MEDLINE | ID: mdl-35671399

ABSTRACT

Antibacterial drugs (AD) change the metabolic status of bacteria, contributing to bacterial death. However, antibiotic resistance and the emergence of multidrug-resistant bacteria increase interest in understanding metabolic network (MN) mutations and the interaction of AD vs MN. In this study, we employed the IFPTML = Information Fusion (IF) + Perturbation Theory (PT) + Machine Learning (ML) algorithm on a huge dataset from the ChEMBL database, which contains >155,000 AD assays vs >40 MNs of multiple bacteria species. We built a linear discriminant analysis (LDA) and 17 ML models centered on the linear index and based on atoms to predict antibacterial compounds. The IFPTML-LDA model presented the following results for the training subset: specificity (Sp) = 76% out of 70,000 cases, sensitivity (Sn) = 70%, and Accuracy (Acc) = 73%. The same model also presented the following results for the validation subsets: Sp = 76%, Sn = 70%, and Acc = 73.1%. Among the IFPTML nonlinear models, the k nearest neighbors (KNN) showed the best results with Sn = 99.2%, Sp = 95.5%, Acc = 97.4%, and Area Under Receiver Operating Characteristic (AUROC) = 0.998 in training sets. In the validation series, the Random Forest had the best results: Sn = 93.96% and Sp = 87.02% (AUROC = 0.945). The IFPTML linear and nonlinear models regarding the ADs vs MNs have good statistical parameters, and they could contribute toward finding new metabolic mutations in antibiotic resistance and reducing time/costs in antibacterial drug research.
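For readers less familiar with the statistics quoted above, the short sketch below shows how sensitivity (Sn), specificity (Sp) and accuracy (Acc) follow from a binary confusion matrix. The counts in the usage line are hypothetical and were chosen only so that the result matches the rounded training figures in the abstract; they are not the study's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of true positives recovered
    specificity = tn / (tn + fp)   # fraction of true negatives recovered
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical, balanced counts (100 positives, 100 negatives).
sn, sp, acc = classification_metrics(tp=70, fn=30, tn=76, fp=24)
```

With balanced classes, accuracy is the mean of Sn and Sp, which is why Sn = 70% and Sp = 76% are compatible with Acc = 73%.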


Subject(s)
Anti-Bacterial Agents ; Machine Learning ; Algorithms ; Anti-Bacterial Agents/pharmacology ; Databases, Factual ; Metabolic Networks and Pathways
12.
J Comput Aided Mol Des ; 36(3): 253-262, 2022 03.
Article in English | MEDLINE | ID: mdl-35359246

ABSTRACT

In drug discovery, partition and distribution coefficients, logP and logD for octanol/water, are widely used as metrics of the lipophilicity of molecules, which in turn has a strong influence on the bioactivity and bioavailability of potential drugs. There are a variety of established methods, mostly fragment or atom-based, to calculate logP, while logD prediction generally relies on calculated logP and pKa for the estimation of neutral and ionized populations at a given pH. Algorithms such as ClogP have limitations generally leading to systematic errors for chemically related molecules, while pKa estimation is generally more difficult due to the interplay of electronic, inductive and conjugation effects for ionizable moieties. We propose an integrated machine learning QSAR modeling approach to predict logD by training the model with experimental data while using ClogP and pKa predicted by commercial software as model descriptors. By optimizing the loss function for the ClogD calculated by the software, we build a correction model that incorporates both descriptors from the software and available experimental logD data. Additionally, we calculate logP from the logD model using the software-predicted pKa values. Here, we have trained models using publicly or commercially available logD data to show that this approach can improve on commercial software predictions of lipophilicity. When applied to other logD data sets, this approach extends the domain of applicability of logD and logP predictions over commercial software. The performance of these models compares favorably with that of models built with a larger set of proprietary logD data.
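The dependence of logD on logP and pKa mentioned above can be illustrated with the standard textbook relation for a compound with a single ionizable group, under the assumption that only the neutral species partitions into octanol. This is a sketch of that baseline relation, not of the machine learning correction model the abstract proposes; the function name and arguments are illustrative.

```python
import math


def log_d(log_p, pka, ph, kind="acid"):
    """Estimate logD at a given pH for a monoprotic acid or base.

    Assumes only the neutral form partitions into octanol:
      acid: logD = logP - log10(1 + 10 ** (pH - pKa))
      base: logD = logP - log10(1 + 10 ** (pKa - pH))
    """
    if kind == "acid":
        exponent = ph - pka
    elif kind == "base":
        exponent = pka - ph
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return log_p - math.log10(1.0 + 10.0 ** exponent)


# Hypothetical acid: logP = 3.0, pKa = 4.4, evaluated at pH 7.4.
estimate = log_d(3.0, 4.4, 7.4, kind="acid")
```

Because the ionized fraction enters through the pKa, any error in the predicted pKa propagates directly into logD, which is one reason a data-driven correction can be useful.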


Subject(s)
Software ; Water ; Algorithms ; Machine Learning ; Octanols/chemistry ; Water/chemistry
13.
J Biomol Struct Dyn ; 40(24): 13366-13377, 2022.
Article in English | MEDLINE | ID: mdl-34637693

ABSTRACT

The RNA-dependent RNA polymerase (RdRp) is one of the crucial enzymes in severe acute respiratory syndrome Coronavirus-2 (SARS-CoV-2), catalysing the replication of RNA, and therefore acts as a potential target for antiviral drug design. The fixation of a ligand in the active site of RdRp may alter the SARS-CoV-2 life cycle. The present work aimed at identifying novel inhibitors of the SARS-CoV-2 RdRp enzyme by performing pharmacophore-based virtual screening, molecular docking and molecular dynamics simulation (MDS). Initially, the pharmacophore model of SARS-CoV-2 RdRp was constructed and the resulting model was used to screen compounds from the ChEMBL, ZINC and PubChem databases. During the investigation, 180 compounds were screened using the above model and subjected to molecular docking with RdRp. Two compounds, viz. ChEMBL1276156 and PubChem135548348, showed a stronger binding affinity for RdRp than its standard inhibitor, Remdesivir. Toxicity prediction of these two compounds reveals their non-toxic nature. These compounds were further subjected to MDS for 100 ns to check their stability after binding with RdRp. The MDS of the RdRp-ChEMBL1276156 and RdRp-PubChem135548348 complexes shows enhanced stability in comparison to the RdRp-Remdesivir complex. The average interaction energy calculated after 100 ns of MDS was -146.56 and -172.68 kJ mol⁻¹ for the RdRp-ChEMBL1276156 complex and the RdRp-PubChem135548348 complex, respectively, compared with -59.90 kJ mol⁻¹ for the RdRp-Remdesivir complex. The current investigation reveals that these two compounds are potent inhibitors of SARS-CoV-2 RdRp and that they could be tested under experimental conditions to evaluate their efficacy against SARS-CoV-2. Communicated by Ramaswamy H. Sarma.


Subject(s)
COVID-19 ; Humans ; Molecular Docking Simulation ; Molecular Dynamics Simulation ; Pharmacophore ; RNA, Viral ; SARS-CoV-2 ; RNA-Dependent RNA Polymerase ; Antiviral Agents/pharmacology
14.
Mol Inform ; 41(5): e2100106, 2022 05.
Article in English | MEDLINE | ID: mdl-34878229

ABSTRACT

Turbo Similarity Searching (TSS) is the simplest and most recent chemical similarity searching (SS) approach, which improves the effectiveness of SS by performing a multi-target searching. TSS has four important elements, namely structural representation, similarity coefficient, number of nearest neighbours (NNs), and fusion rule, and any changes in these elements could affect the TSS results. A previous study suggested the advantage of using large numbers of reference compounds with small fractions of the database structures to obtain a better recall in group fusion. Therefore, this study aims to investigate the effect of partial ranking on TSS utilising different fusion rules and different numbers of NNs on the ChEMBL database and to evaluate whether these observations hold in TSS. Furthermore, the objective is to observe the effect of the indirect relationship feature of TSS on the partial ranking investigation. The results showed that the effect of using partial ranking on TSS was significant. This study also found that the performance of TSS improved as the database proportions used in the fusion process decreased and by using a small number of NNs. In addition, fusion rules based on reciprocal rank positions (RKP), maximum similarity score (sMAX), and sMNZ were superior to all the other fusion rules.
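The fusion rules named above can be sketched as follows. The abstract does not define them, so the definitions here are assumptions based on common data-fusion usage: sMAX takes the highest similarity a compound receives from any reference, RKP sums reciprocal rank positions, and sMNZ multiplies the summed similarity by the number of lists in which the compound appears. Each input list is a truncated (partial) ranking, and all identifiers and values are hypothetical.

```python
from collections import defaultdict


def fuse(rankings, rule="sMAX"):
    """Fuse several partial rankings into one ordered list of compound ids.

    rankings: list of lists of (compound_id, similarity) pairs, each sorted
    by decreasing similarity and truncated to the top fraction retained.
    """
    scores = defaultdict(list)   # similarities collected per compound
    ranks = defaultdict(list)    # 1-based rank positions per compound
    for ranking in rankings:
        for position, (compound_id, similarity) in enumerate(ranking, start=1):
            scores[compound_id].append(similarity)
            ranks[compound_id].append(position)

    if rule == "sMAX":
        fused = {cid: max(values) for cid, values in scores.items()}
    elif rule == "RKP":
        fused = {cid: sum(1.0 / r for r in rs) for cid, rs in ranks.items()}
    elif rule == "sMNZ":
        fused = {cid: sum(values) * len(values) for cid, values in scores.items()}
    else:
        raise ValueError("unknown fusion rule: " + rule)
    # Highest fused score first; ties broken by compound id for determinism.
    return sorted(fused, key=lambda cid: (-fused[cid], cid))


# Two hypothetical partial rankings, one per reference structure.
rankings = [[("A", 0.90), ("B", 0.60)], [("B", 0.70), ("C", 0.65)]]
by_max = fuse(rankings, "sMAX")
by_rank = fuse(rankings, "RKP")
```

In this toy case sMAX favours compound A, which has the single best score, whereas RKP and sMNZ favour B, which is retrieved by both references.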


Subject(s)
Databases, Factual
15.
Methods Mol Biol ; 2390: 153-176, 2022.
Article in English | MEDLINE | ID: mdl-34731468

ABSTRACT

Artificial intelligence (AI) tools find increasing application in drug discovery supporting every stage of the Design-Make-Test-Analyse (DMTA) cycle. The main focus of this chapter is the application in molecular generation with the aid of deep neural networks (DNN). We present a historical overview of the main advances in the field. We analyze the concepts of distribution and goal-directed learning and then highlight some of the recent applications of generative models in drug design with a focus into research work from the biopharmaceutical industry. We present in some more detail REINVENT which is an open-source software developed within our group in AstraZeneca and the main platform for AI molecular design support for a number of medicinal chemistry projects in the company and we also demonstrate some of our work in library design. Finally, we present some of the main challenges in the application of AI in Drug Discovery and different approaches to respond to these challenges which define areas for current and future work.


Subject(s)
Artificial Intelligence ; Drug Discovery ; Drug Design ; Neural Networks, Computer
16.
Front Mol Biosci ; 8: 758480, 2021.
Article in English | MEDLINE | ID: mdl-34938773

ABSTRACT

Given the abundant computational resources and the huge amount of data of compound-protein interactions (CPIs), constructing appropriate datasets for learning and evaluating prediction models for CPIs is not always easy. For this study, we have developed a web server to facilitate the development and evaluation of prediction models by providing an appropriate dataset according to the task. Our web server provides an environment and dataset that aid model developers and evaluators in obtaining a suitable dataset for both proteins and compounds, in addition to attributes necessary for deep learning. With the web server interface, users can customize the CPI dataset derived from ChEMBL by setting positive and negative thresholds to be adjusted according to the user's definitions. We have also implemented a function for graphic display of the distribution of activity values in the dataset as a histogram to set appropriate thresholds for positive and negative examples. These functions enable effective development and evaluation of models. Furthermore, users can prepare their task-specific datasets by selecting a set of target proteins based on various criteria such as Pfam families, ChEMBL's classification, and sequence similarities. The accuracy and efficiency of in silico screening and drug design using machine learning including deep learning can therefore be improved by facilitating access to an appropriate dataset prepared using our web server (https://binds.lifematics.work/).
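The thresholding idea described above can be sketched in a few lines. This is not the web server's implementation: it assumes activity values on a negative-log scale where higher means more active, and the function names, threshold values and bin width are hypothetical.

```python
import math
from collections import Counter


def label_activity(p_activity, positive_threshold, negative_threshold):
    """Return 1 (positive), 0 (negative) or None (ambiguous, to be dropped)."""
    if negative_threshold > positive_threshold:
        raise ValueError("negative threshold must not exceed positive threshold")
    if p_activity >= positive_threshold:
        return 1
    if p_activity <= negative_threshold:
        return 0
    return None


def activity_histogram(values, bin_width=1.0):
    """Count values per bin; keys are the lower edges of the bins."""
    return dict(Counter(math.floor(v / bin_width) * bin_width for v in values))


# Hypothetical activity values for five compound-protein pairs.
values = [4.2, 5.5, 6.1, 7.3, 8.0]
labels = [label_activity(v, positive_threshold=7.0, negative_threshold=5.0)
          for v in values]
histogram = activity_histogram(values)
```

Leaving a gap between the two thresholds discards borderline measurements, and a histogram of the values helps to see how many examples a given pair of thresholds would keep.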

17.
Int J Mol Sci ; 22(23)2021 Dec 02.
Article in English | MEDLINE | ID: mdl-34884870

ABSTRACT

The parasite species of genus Plasmodium causes Malaria, which remains a major global health problem due to parasite resistance to available Antimalarial drugs and increasing treatment costs. Consequently, computational prediction of new Antimalarial compounds with novel targets in the proteome of Plasmodium sp. is a very important goal for the pharmaceutical industry. We can expect that the success of the pre-clinical assay depends on the conditions of assay per se, the chemical structure of the drug, the structure of the target protein to be targeted, as well as on factors governing the expression of this protein in the proteome such as genes (Deoxyribonucleic acid, DNA) sequence and/or chromosomes structure. However, there are no reports of computational models that consider all these factors simultaneously. Some of the difficulties for this kind of analysis are the dispersion of data in different datasets, the high heterogeneity of data, etc. In this work, we analyzed three databases ChEMBL (Chemical database of the European Molecular Biology Laboratory), UniProt (Universal Protein Resource), and NCBI-GDV (National Center for Biotechnology Information-Genome Data Viewer) to achieve this goal. The ChEMBL dataset contains outcomes for 17,758 unique assays of potential Antimalarial compounds including numeric descriptors (variables) for the structure of compounds as well as a huge amount of information about the conditions of assays. The NCBI-GDV and UniProt datasets include the sequence of genes, proteins, and their functions. In addition, we also created two partitions (cassayj = caj and cdataj = cdj) of categorical variables from the ChEMBL dataset. These partitions contain variables that encode information about experimental conditions of preclinical assays (caj) or about the nature and quality of data (cdj). 
These categorical variables include information about 22 parameters of biological activity (ca0), 28 target proteins (ca1), and 9 organisms of assay (ca2), etc. We also created another partition (cprotj = cpj) including categorical variables with biological information about the target proteins, genes, and chromosomes. These variables cover 32 genes (cp0), 10 chromosomes (cp1), gene orientation (cp2), and 31 protein functions (cp3). We used a Perturbation-Theory Machine Learning Information Fusion (IFPTML) algorithm to map all this information (from three databases) and train a predictive model. Shannon's entropy measure Shk (numerical variables) was used to quantify the information about the structure of drugs, protein sequences, gene sequences, and chromosomes in the same information scale. Perturbation Theory Operators (PTOs) with the form of Moving Average (MA) operators have been used to quantify perturbations (deviations) in the structural variables with respect to their expected values for different subsets (partitions) of categorical variables. We obtained three IFPTML models using General Discriminant Analysis (GDA), Classification Tree with Univariate Splits (CTUS), and Classification Tree with Linear Combinations (CTLC). The IFPTML-CTLC presented the best performance with Sensitivity Sn(%) = 83.6/85.1, and Specificity Sp(%) = 89.8/89.7 for training/validation sets, respectively. This model could become a useful tool for the optimization of preclinical assays of new Antimalarial compounds vs. different proteins in the proteome of Plasmodium.
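The two numerical ingredients named above, a Shannon entropy measure and Moving Average perturbation operators, can be illustrated generically as follows. The abstract does not specify which probability distributions feed the entropy or how the operators are parameterized, so this sketch assumes the plain definitions: entropy of a discrete distribution in natural-log units, and the deviation of each value from the mean of all values that share its categorical label. All names and numbers are hypothetical.

```python
import math
from collections import defaultdict


def shannon_entropy(probabilities):
    """Shannon entropy (natural log) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def moving_average_deviation(values, categories):
    """Return value minus the mean of its category, for every record.

    This mimics a perturbation operator: how far a descriptor lies from
    its expected value under a given experimental condition.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, category in zip(values, categories):
        totals[category] += value
        counts[category] += 1
    return [value - totals[category] / counts[category]
            for value, category in zip(values, categories)]


# Hypothetical descriptor values measured under two assay conditions.
deviations = moving_average_deviation([1.0, 3.0, 10.0], ["a", "a", "b"])
entropy = shannon_entropy([0.5, 0.5])
```

Expressing every descriptor as a deviation from its condition-specific mean is what lets heterogeneous assay conditions enter a single model on a common scale.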


Subject(s)
Antimalarials/pharmacology ; Drug Discovery/methods ; Machine Learning ; Plasmodium falciparum/genetics ; Algorithms ; Antimalarials/chemistry ; Databases, Pharmaceutical ; Drug Evaluation, Preclinical ; Genome, Protozoan ; Markov Chains ; Models, Theoretical ; Protozoan Proteins/chemistry ; Protozoan Proteins/genetics ; Protozoan Proteins/metabolism ; Reproducibility of Results
18.
Biomolecules ; 11(11)2021 11 08.
Article in English | MEDLINE | ID: mdl-34827645

ABSTRACT

Currently, G protein-coupled receptors are the targets with the highest number of drugs in many therapeutic areas. Fluorination has become a common strategy in designing highly active biological compounds, as evidenced by the steadily increasing number of newly approved fluorine-containing drugs. Herein, we identified in the ChEMBL database and analysed 1554 target-based FSAR sets (non-fluorinated compounds and their fluorinated analogues) comprising 966 unique non-fluorinated and 2457 unique fluorinated compounds active against 33 different aminergic GPCRs. Although a relatively small number of activity cliffs (defined as a pair of structurally similar compounds showing significant differences of activity, ΔpPot > 1.7) was found in FSAR sets, it is clear that appropriately introduced fluorine can increase ligand potency more than 50-fold. The analysis of matched molecular pairs (MMPs) networks indicated that the fluorination of the aromatic ring showed no clear trend towards a positive or negative effect on affinity; however, a favourable site for a positive potency effect of fluorination was the ortho position. Fluorination of aliphatic fragments more often led to a decrease in biological activity. The results may constitute the rules of thumb for fluorination of aminergic receptor ligands and provide insights into the role of fluorine substitutions in medicinal chemistry.
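The activity-cliff criterion quoted above (ΔpPot > 1.7) and the "more than 50-fold" statement are two views of the same quantity, since a difference of 1.7 log units corresponds to a factor of 10^1.7, roughly 50. The sketch below makes that conversion explicit, assuming potencies are given in nM and pPot is the negative decadic logarithm of the molar value; the function names and example potencies are illustrative.

```python
import math


def p_pot(potency_nm):
    """Negative log10 of a potency given in nM, expressed on the molar scale."""
    return 9.0 - math.log10(potency_nm)


def activity_cliff(potency_a_nm, potency_b_nm, threshold=1.7):
    """Return (is_cliff, fold_change) for a pair of structurally similar compounds."""
    delta = abs(p_pot(potency_a_nm) - p_pot(potency_b_nm))
    return delta > threshold, 10.0 ** delta


# Hypothetical pair: 1 nM for one analogue, 100 nM for the other.
is_cliff, fold = activity_cliff(1.0, 100.0)
```

A pair differing 10-fold in potency (ΔpPot = 1) would not qualify under this threshold, whereas the 100-fold pair in the example does.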


Subject(s)
Receptors, G-Protein-Coupled ; Halogenation ; Protein Binding
19.
Int J Mol Sci ; 22(21)2021 Oct 26.
Article in English | MEDLINE | ID: mdl-34768951

ABSTRACT

The theoretical prediction of drug-decorated nanoparticles (DDNPs) has become a very important task in medical applications. For the current paper, Perturbation Theory Machine Learning (PTML) models were built to predict the probability of different pairs of drugs and nanoparticles creating DDNP complexes with anti-glioblastoma activity. PTML models use the perturbations of molecular descriptors of drugs and nanoparticles as inputs in experimental conditions. The raw dataset was obtained by mixing the nanoparticle experimental data with drug assays from the ChEMBL database. Ten types of machine learning methods have been tested. Only 41 features have been selected for 855,129 drug-nanoparticle complexes. The best model was obtained with the Bagging classifier, an ensemble meta-estimator based on 20 decision trees, with an area under the receiver operating characteristic curve (AUROC) of 0.96, and an accuracy of 87% (test subset). This model could be useful for the virtual screening of nanoparticle-drug complexes in glioblastoma. All the calculations can be reproduced with the datasets and python scripts, which are freely available as a GitHub repository from authors.


Subject(s)
Antineoplastic Agents/administration & dosage ; Brain Neoplasms/drug therapy ; Drug Delivery Systems ; Glioblastoma/drug therapy ; Machine Learning ; Nanoparticles ; Databases, Chemical ; Databases, Pharmaceutical ; Drug Carriers/administration & dosage ; Drug Design ; Drug Screening Assays, Antitumor ; Humans ; Nanoparticles/administration & dosage ; User-Computer Interface
20.
Molecules ; 26(17)2021 Aug 30.
Article in English | MEDLINE | ID: mdl-34500686

ABSTRACT

A method is presented to analyze quantitatively the degree of congenericity of claimed compounds in patent applications. The approach successfully differentiates patents exemplified with highly congeneric compounds of a structurally compact and well defined chemical series from patents containing a more diverse set of compounds around a more vaguely described patent claim. An application to 750 common patents available in SureChEMBL, SureChEMBLccs and ChEMBL is presented and the congenericity of patent compounds in those different sources discussed.
