Búsqueda | Portal Regional de la BVS

1.

Predicting Off-Target Binding Profiles With Confidence Using Conformal Prediction.

Lampa, Samuel; Alvarsson, Jonathan; Arvidsson Mc Shane, Staffan; Berg, Arvid; Ahlberg, Ernst; Spjuth, Ola.

Front Pharmacol ; 9: 1256, 2018.

Artículo en Inglés | MEDLINE | ID: mdl-30459617

RESUMEN

Ligand-based models can be used in drug discovery to obtain an early indication of potential off-target interactions that could be linked to adverse effects. Another application is to combine such models into a panel, allowing to compare and search for compounds with similar profiles. Most contemporary methods and implementations however lack valid measures of confidence in their predictions, and only provide point predictions. We here describe a methodology that uses Conformal Prediction for predicting off-target interactions, with models trained on data from 31 targets in the ExCAPE-DB dataset selected for their utility in broad early hazard assessment. Chemicals were represented by the signature molecular descriptor and support vector machines were used as the underlying machine learning method. By using conformal prediction, the results from predictions come in the form of confidence p-values for each class. The full pre-processing and model training process is openly available as scientific workflows on GitHub, rendering it fully reproducible. We illustrate the usefulness of the developed methodology on a set of compounds extracted from DrugBank. The resulting models are published online and are available via a graphical web interface and an OpenAPI interface for programmatic access.

2.

A confidence predictor for logD using conformal regression and a support-vector machine.

Lapins, Maris; Arvidsson, Staffan; Lampa, Samuel; Berg, Arvid; Schaal, Wesley; Alvarsson, Jonathan; Spjuth, Ola.

J Cheminform ; 10(1): 17, 2018 Apr 03.

Artículo en Inglés | MEDLINE | ID: mdl-29616425

RESUMEN

Lipophilicity is a major determinant of ADMET properties and overall suitability of drug candidates. We have developed large-scale models to predict water-octanol distribution coefficient (logD) for chemical compounds, aiding drug discovery projects. Using ACD/logD data for 1.6 million compounds from the ChEMBL database, models are created and evaluated by a support-vector machine with a linear kernel using conformal prediction methodology, outputting prediction intervals at a specified confidence level. The resulting model shows a predictive ability of [Formula: see text] and with the best performing nonconformity measure having median prediction interval of [Formula: see text] log units at 80% confidence and [Formula: see text] log units at 90% confidence. The model is available as an online service via an OpenAPI interface, a web page with a molecular editor, and we also publish predictive values at 90% confidence level for 91 M PubChem structures in RDF format for download and as an URI resolver service.

3.

The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching.

Willighagen, Egon L; Mayfield, John W; Alvarsson, Jonathan; Berg, Arvid; Carlsson, Lars; Jeliazkova, Nina; Kuhn, Stefan; Pluskal, Tomás; Rojas-Chertó, Miquel; Spjuth, Ola; Torrance, Gilleain; Evelo, Chris T; Guha, Rajarshi; Steinbeck, Christoph.

J Cheminform ; 9(1): 33, 2017 Jun 06.

Artículo en Inglés | MEDLINE | ID: mdl-29086040

RESUMEN

BACKGROUND: The Chemistry Development Kit (CDK) is a widely used open source cheminformatics toolkit, providing data structures to represent chemical concepts along with methods to manipulate such structures and perform computations on them. The library implements a wide variety of cheminformatics algorithms ranging from chemical structure canonicalization to molecular descriptor calculations and pharmacophore perception. It is used in drug discovery, metabolomics, and toxicology. Over the last 10 years, the code base has grown significantly, however, resulting in many complex interdependencies among components and poor performance of many algorithms. RESULTS: We report improvements to the CDK v2.0 since the v1.2 release series, specifically addressing the increased functional complexity and poor performance. We first summarize the addition of new functionality, such atom typing and molecular formula handling, and improvement to existing functionality that has led to significantly better performance for substructure searching, molecular fingerprints, and rendering of molecules. Second, we outline how the CDK has evolved with respect to quality control and the approaches we have adopted to ensure stability, including a code review mechanism. CONCLUSIONS: This paper highlights our continued efforts to provide a community driven, open source cheminformatics library, and shows that such collaborative projects can thrive over extended periods of time, resulting in a high-quality and performant library. By taking advantage of community support and contributions, we show that an open source cheminformatics project can act as a peer reviewed publishing platform for scientific computing software. Graphical abstract CDK 2.0 provides new features and improved performance.

4.

Erratum to: The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching.

Willighagen, Egon L; Mayfield, John W; Alvarsson, Jonathan; Berg, Arvid; Carlsson, Lars; Jeliazkova, Nina; Kuhn, Stefan; Pluskal, Tomás; Rojas-Chertó, Miquel; Spjuth, Ola; Torrance, Gilleain; Evelo, Chris T; Guha, Rajarshi; Steinbeck, Christoph.

J Cheminform ; 9(1): 53, 2017 Sep 20.

Artículo en Inglés | MEDLINE | ID: mdl-29086079

5.

Applications of the InChI in cheminformatics with the CDK and Bioclipse.

Spjuth, Ola; Berg, Arvid; Adams, Samuel; Willighagen, Egon L.

J Cheminform ; 5(1): 14, 2013 Mar 13.

Artículo en Inglés | MEDLINE | ID: mdl-23497723

RESUMEN

BACKGROUND: The InChI algorithms are written in C++ and not available as Java library. Integration into software written in Java therefore requires a bridge between C and Java libraries, provided by the Java Native Interface (JNI) technology. RESULTS: We here describe how the InChI library is used in the Bioclipse workbench and the Chemistry Development Kit (CDK) cheminformatics library. To make this possible, a JNI bridge to the InChI library was developed, JNI-InChI, allowing Java software to access the InChI algorithms. By using this bridge, the CDK project packages the InChI binaries in a module and offers easy access from Java using the CDK API. The Bioclipse project packages and offers InChI as a dynamic OSGi bundle that can easily be used by any OSGi-compliant software, in addition to the regular Java Archive and Maven bundles. Bioclipse itself uses the InChI as a key component and calculates it on the fly when visualizing and editing chemical structures. We demonstrate the utility of InChI with various applications in CDK and Bioclipse, such as decision support for chemical liability assessment, tautomer generation, and for knowledge aggregation using a linked data approach. CONCLUSIONS: These results show that the InChI library can be used in a variety of Java library dependency solutions, making the functionality easily accessible by Java software, such as in the CDK. The applications show various ways the InChI has been used in Bioclipse, to enrich its functionality.

6.

Bioclipse-R: integrating management and visualization of life science data with statistical analysis.

Spjuth, Ola; Georgiev, Valentin; Carlsson, Lars; Alvarsson, Jonathan; Berg, Arvid; Willighagen, Egon; Wikberg, Jarl E S; Eklund, Martin.

Bioinformatics ; 29(2): 286-9, 2013 Jan 15.

Artículo en Inglés | MEDLINE | ID: mdl-23178637

RESUMEN

SUMMARY: Bioclipse, a graphical workbench for the life sciences, provides functionality for managing and visualizing life science data. We introduce Bioclipse-R, which integrates Bioclipse and the statistical programming language R. The synergy between Bioclipse and R is demonstrated by the construction of a decision support system for anticancer drug screening and mutagenicity prediction, which shows how Bioclipse-R can be used to perform complex tasks from within a single software system. AVAILABILITY AND IMPLEMENTATION: Bioclipse-R is implemented as a set of Java plug-ins for Bioclipse based on the R-package rj. Source code and binary packages are available from https://github.com/bioclipse and http://www.bioclipse.net/bioclipse-r, respectively. CONTACT: martin.eklund@farmbio.uu.se SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Asunto(s)

Disciplinas de las Ciencias Biológicas , Gráficos por Computador , Programas Informáticos , Antineoplásicos/química , Antineoplásicos/farmacología , Antineoplásicos/toxicidad , Interpretación Estadística de Datos , Mutagénesis , Lenguajes de Programación , Relación Estructura-Actividad Cuantitativa , Integración de Sistemas

7.

Bioclipse 2: a scriptable integration platform for the life sciences.

Spjuth, Ola; Alvarsson, Jonathan; Berg, Arvid; Eklund, Martin; Kuhn, Stefan; Mäsak, Carl; Torrance, Gilleain; Wagener, Johannes; Willighagen, Egon L; Steinbeck, Christoph; Wikberg, Jarl E S.

BMC Bioinformatics ; 10: 397, 2009 Dec 03.

Artículo en Inglés | MEDLINE | ID: mdl-19958528

RESUMEN

BACKGROUND: Contemporary biological research integrates neighboring scientific domains to answer complex questions in fields such as systems biology and drug discovery. This calls for tools that are intuitive to use, yet flexible to adapt to new tasks. RESULTS: Bioclipse is a free, open source workbench with advanced features for the life sciences. Version 2.0 constitutes a complete rewrite of Bioclipse, and delivers a stable, scalable integration platform for developers and an intuitive workbench for end users. All functionality is available both from the graphical user interface and from a built-in novel domain-specific language, supporting the scientist in interdisciplinary research and reproducible analyses through advanced visualization of the inputs and the results. New components for Bioclipse 2 include a rewritten editor for chemical structures, a table for multiple molecules that supports gigabyte-sized files, as well as a graphical editor for sequences and alignments. CONCLUSION: Bioclipse 2 is equipped with advanced tools required to carry out complex analysis in the fields of bio- and cheminformatics. Developed as a Rich Client based on Eclipse, Bioclipse 2 leverages on today's powerful desktop computers for providing a responsive user interface, but also takes full advantage of the Web and networked (Web/Cloud) services for more demanding calculations or retrieval of data. The fact that Bioclipse 2 is based on an advanced and widely used service platform ensures wide extensibility, making it easy to add new algorithms, visualizations, as well as scripting commands. The intuitive tools for end users and the extensible architecture make Bioclipse 2 ideal for interdisciplinary and integrative research.Bioclipse 2 is released under the Eclipse Public License (EPL), a flexible open source license that allows additional plugins to be of any license. Bioclipse 2 is implemented in Java and supported on all major platforms; Source code and binaries are freely available at http://www.bioclipse.net.

Asunto(s)

Biología Computacional/métodos , Programas Informáticos , Algoritmos , Disciplinas de las Ciencias Biológicas , Bases de Datos Factuales

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA