The creation and characterisation of a National Compound Collection: the Royal Society of Chemistry pilot.

Andrews, David M; Broad, Laura M; Edwards, Paul J; Fox, David N A; Gallagher, Timothy; Garland, Stephen L; Kidd, Richard; Sweeney, Joseph B


Affiliation

Andrews DM; Royal Society of Chemistry, Thomas Graham House, Science Park, Milton Road, Cambridge, CB4 0WF, UK. Email: david.andrews@astrazeneca.com.
Broad LM; School of Chemistry, University of Bristol, Bristol, BS8 1TS, UK.
Edwards PJ; Scicate Limited, Mendip Court, Bath Road, Wells, Somerset BA5 3DG, UK.
Fox DNA; Royal Society of Chemistry, Thomas Graham House, Science Park, Milton Road, Cambridge, CB4 0WF, UK. Email: david.andrews@astrazeneca.com.
Gallagher T; School of Chemistry, University of Bristol, Bristol, BS8 1TS, UK.
Garland SL; NQuiX Ltd, Causeway House, Dane Street, Bishops Stortford, Hertfordshire CM23 3BT, UK.
Kidd R; Royal Society of Chemistry, Thomas Graham House, Science Park, Milton Road, Cambridge, CB4 0WF, UK. Email: david.andrews@astrazeneca.com.
Sweeney JB; Department of Chemical Sciences, University of Huddersfield, Huddersfield HD1 3DH, UK.

Chem Sci; 7(6): 3869-3878, 2016 Jun 01.

Article in English | MEDLINE | ID: mdl-30155031

ABSTRACT

We present a summary of the National Compound Collection (NCC) pilot, which harvested chemical structure data from 746 publicly available PhD theses to create an enhanced database of diverse and interesting (largely organic) molecular entities. The database comprised ~75 000 structure entries, of which 70% were new to ChemSpider at the time of upload. The dataset was evaluated for structural uniqueness by twelve external drug discovery groups from the pharmaceutical, biotech, academic and not-for-profit sectors. These partners generated the data reported here, comparing the NCC pilot with their in-house compound collections. The proportion of NCC structures considered useful for drug discovery ranged from 5% to 80%, depending on the strictness of the filters used. Most interestingly from a drug discovery standpoint, ~13k NCC compounds (18% of the NCC) passed the filters and were of good diversity. These compounds are quite different from those already present in the screening collections, but not so different that they are no longer considered drug-like. In general, the drug discovery teams would consider these compounds to be high-value molecules for inclusion in their screening collections. This pilot addressed the potential value of unpublished data and explored the practicalities of large-scale data extraction, to inform both retrospective and prospective extraction of chemical data from theses.


Full text: 1 Collection: 01-international Database: MEDLINE Language: En Journal: Chem Sci Year: 2016 Document type: Article Country of publication: United Kingdom
