Results 1 - 13 of 13
1.
ACS Chem Biol ; 10(8): 1939-51, 2015 Aug 21.
Article in English | MEDLINE | ID: mdl-26056718

ABSTRACT

Mammalian central nervous system (CNS) neurons regrow their axons poorly following injury, resulting in irreversible functional losses. Identifying therapeutics that encourage CNS axon repair has been difficult, in part because multiple etiologies underlie this regenerative failure. This suggests a particular need for drugs that engage multiple molecular targets. Although multitarget drugs are generally more effective than highly selective alternatives, we lack systematic methods for discovering such drugs. Target-based screening is an efficient technique for identifying potent modulators of individual targets. In contrast, phenotypic screening can identify drugs with multiple targets; however, these targets remain unknown. To address this gap, we combined the two drug discovery approaches using machine learning and information theory. We screened compounds in a phenotypic assay with primary CNS neurons and also in a panel of kinase enzyme assays. We used learning algorithms to relate the compounds' kinase inhibition profiles to their influence on neurite outgrowth. This allowed us to identify kinases that may serve as targets for promoting neurite outgrowth as well as others whose targeting should be avoided. We found that compounds that inhibit multiple targets (polypharmacology) promote robust neurite outgrowth in vitro. One compound with exemplary polypharmacology was found to promote axon growth in a rodent spinal cord injury model. A more general applicability of our approach is suggested by its ability to deconvolve known targets for a breast cancer cell line as well as targets recently shown to mediate drug resistance.
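
The abstract does not say which information-theoretic quantity was used to relate kinase inhibition to neurite outgrowth. As one plausible illustration only, the sketch below computes the mutual information between a binary "compound inhibits kinase K" indicator and a binary "compound promotes outgrowth" indicator. The function name and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long discrete sequences.

    Assumption: xs[i] flags whether compound i inhibits a given kinase and
    ys[i] flags whether it promotes neurite outgrowth. Both are hypothetical
    encodings chosen for illustration.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # counts of each (x, y) pair
    px = Counter(xs)               # marginal counts of x
    py = Counter(ys)               # marginal counts of y
    total = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        total += p_xy * log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return total


# Toy data: inhibition perfectly tracks outgrowth, then is unrelated to it.
informative = mutual_information([1, 1, 0, 0], [1, 1, 0, 0])
uninformative = mutual_information([1, 1, 0, 0], [1, 0, 1, 0])
```

A kinase whose inhibition carries a lot of information about the phenotype would be a candidate target or anti-target; the direction of the effect has to be checked separately, because mutual information is symmetric and unsigned.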


Subject(s)
Drug Discovery/methods, Nerve Regeneration/drug effects, Neurites/drug effects, Neurons/drug effects, Protein Kinase Inhibitors/pharmacology, Animals, Cultured Cells, Central Nervous System/cytology, Central Nervous System/drug effects, Central Nervous System/physiology, Humans, Machine Learning, Neurites/physiology, Neurons/physiology, Polypharmacology, Protein Kinases/genetics, Protein Kinases/metabolism, Small Interfering RNA/genetics, Rats
2.
BMC Cancer ; 14: 584, 2014 Aug 11.
Article in English | MEDLINE | ID: mdl-25112586

ABSTRACT

BACKGROUND: Increasing focus on potentially unnecessary diagnosis and treatment of certain breast cancers prompted our investigation of whether clinical and mammographic features predictive of invasive breast cancer versus ductal carcinoma in situ (DCIS) differ by age. METHODS: We analyzed 1,475 malignant breast biopsies, 1,063 invasive and 412 DCIS, from 35,871 prospectively collected consecutive diagnostic mammograms interpreted at University of California, San Francisco between 1/6/1997 and 6/29/2007. We constructed three logistic regression models to predict the probability of invasive cancer versus DCIS for the following groups: women ≥ 65 (older group), women 50-64 (middle age group), and women < 50 (younger group). We identified significant predictors and measured the performance in all models using area under the receiver operating characteristic curve (AUC). RESULTS: The models for the older and middle age groups performed significantly better than the model for the younger group (AUC = 0.848 vs. 0.778; p = 0.049 and AUC = 0.851 vs. 0.778; p = 0.022, respectively). Palpability and principal mammographic finding were significant predictors in distinguishing invasive cancer from DCIS in all age groups. Family history of breast cancer, mass shape and mass margins were significant positive predictors of invasive cancer in the older group, whereas calcification distribution was a negative predictor of invasive cancer (i.e. predicted DCIS). Mass margins in the middle age group and mass size in the younger group were positive predictors of invasive cancer. CONCLUSIONS: Clinical and mammographic features predict invasive breast cancer versus DCIS better in older women than in younger women. Specific predictive variables differ based on age.
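
To make the AUC comparison concrete, here is a minimal sketch of how an AUC can be computed from predicted probabilities, using the pairwise (Mann-Whitney) formulation. The labels (1 = invasive, 0 = DCIS) and the scores are hypothetical; this is not the authors' model or data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as 0.5).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted probabilities of invasive cancer for five biopsies.
example_scores = [0.9, 0.4, 0.35, 0.6, 0.5]
example_labels = [1, 1, 0, 1, 0]   # 1 = invasive, 0 = DCIS (assumed encoding)
example_auc = auc(example_scores, example_labels)
```

Five of the six positive-negative pairs are ordered correctly, so the result is 5/6. The quadratic loop keeps the definition visible; a rank-based implementation would be preferable for tens of thousands of cases.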


Subject(s)
Breast Neoplasms/pathology, Ductal Breast Carcinoma/pathology, Noninfiltrating Intraductal Carcinoma/pathology, Adult, Age Factors, Aged, Breast Neoplasms/epidemiology, Ductal Breast Carcinoma/epidemiology, Noninfiltrating Intraductal Carcinoma/epidemiology, Female, Humans, Logistic Models, Mammography, Middle Aged, Risk Factors
3.
Article in English | MEDLINE | ID: mdl-26158123

ABSTRACT

Machine learning is continually being applied to a growing set of fields, including the social sciences, business, and medicine. Some fields present problems that are not easily addressed using standard machine learning approaches and, in particular, there is growing interest in differential prediction. In this type of task we are interested in producing a classifier that specifically characterizes a subgroup of interest by maximizing the difference in predictive performance for some outcome between subgroups in a population. We discuss adapting maximum margin classifiers for differential prediction. We first introduce multiple approaches that do not affect the key properties of maximum margin classifiers, but which also do not directly attempt to optimize a standard measure of differential prediction. We next propose a model that directly optimizes a standard measure in this field, the uplift measure. We evaluate our models on real data from two medical applications and show excellent results.
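
The abstract does not define the uplift measure. One common formulation, assumed here, takes the lift of a classifier at a fraction f to be the number of positives among the top-scored fraction f of a group, and the uplift to be the lift on the subgroup of interest minus the lift on the remaining (control) subgroup. The sketch below follows that assumption; the function names and toy data are hypothetical.

```python
def lift(scores, labels, fraction):
    """Number of positive cases among the top `fraction` of cases by score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    k = int(fraction * len(ranked))
    return sum(label for _, label in ranked[:k])


def uplift_curve(target, control, fractions):
    """Lift on the target subgroup minus lift on the control subgroup.

    `target` and `control` are (scores, labels) tuples scored by the same
    classifier. This is an assumed formulation, not the paper's definition.
    """
    return [
        lift(target[0], target[1], f) - lift(control[0], control[1], f)
        for f in fractions
    ]


# Toy example with four cases per subgroup.
target_group = ([0.9, 0.7, 0.4, 0.1], [1, 1, 0, 0])
control_group = ([0.8, 0.6, 0.3, 0.2], [0, 1, 0, 1])
curve = uplift_curve(target_group, control_group, [0.25, 0.5, 0.75, 1.0])
```

A differential classifier ranks the target subgroup's positives early and the control subgroup's positives late, which keeps the curve above zero for most fractions. Here both groups contain two positives, so the curve returns to zero at the last point.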

4.
AMIA Annu Symp Proc ; 2013: 876-85, 2013.
Article in English | MEDLINE | ID: mdl-24551380

ABSTRACT

Several recent genome-wide association studies have identified genetic variants associated with breast cancer. However, how much these genetic variants may help advance breast cancer risk prediction based on other clinical features, like mammographic findings, is unknown. We conducted a retrospective case-control study, collecting mammographic findings and high-frequency/low-penetrance genetic variants from an existing personalized medicine data repository. A Bayesian network was developed using Tree Augmented Naive Bayes (TAN) by training on the mammographic findings, with and without the 22 genetic variants collected. We analyzed the predictive performance using the area under the ROC curve, and found that the genetic variants significantly improved breast cancer risk prediction on mammograms. We also identified the interaction effect between the genetic variants and collected mammographic findings in an attempt to link genotype to mammographic phenotype to better understand disease patterns, mechanisms, and/or natural history.


Subject(s)
Bayes Theorem, Breast Neoplasms/genetics, Mammography, Risk Assessment/methods, Breast Neoplasms/diagnostic imaging, Case-Control Studies, Female, Genetic Predisposition to Disease, Genotype, Humans, Neural Networks (Computer), Single Nucleotide Polymorphism, ROC Curve
5.
Article in English | MEDLINE | ID: mdl-26158122

ABSTRACT

We introduce Score As You Lift (SAYL), a novel Statistical Relational Learning (SRL) algorithm, and apply it to an important task in the diagnosis of breast cancer. SAYL combines SRL with the marketing concept of uplift modeling, uses the area under the uplift curve to direct clause construction and final theory evaluation, integrates rule learning and probability assignment, and conditions the addition of each new theory rule to existing ones. Breast cancer, the most common type of cancer among women, is categorized into two subtypes: an earlier in situ stage where cancer cells are still confined, and a subsequent invasive stage. Currently older women with in situ cancer are treated to prevent cancer progression, regardless of the fact that treatment may generate undesirable side-effects, and the woman may die of other causes. Younger women tend to have more aggressive cancers, while older women tend to have more indolent tumors. Therefore older women whose in situ tumors show significant dissimilarity with in situ cancer in younger women are less likely to progress, and can thus be considered for watchful waiting. Motivated by this important problem, this work makes two main contributions. First, we present the first multi-relational uplift modeling system, and introduce, implement and evaluate a novel method to guide search in an SRL framework. Second, we compare our algorithm to previous approaches, and demonstrate that the system can indeed obtain differential rules of interest to an expert on real data, while significantly improving the data uplift.

6.
Healthcom ; 2013(15th): 283-285, 2013 Oct 09.
Article in English | MEDLINE | ID: mdl-26501132

ABSTRACT

When mammography reveals a suspicious finding, a core needle biopsy is usually recommended. In 5% to 15% of these cases, the biopsy diagnosis is non-definitive and a more invasive surgical excisional biopsy is recommended to confirm a diagnosis. The majority of these cases will ultimately be proven benign. The use of excisional biopsy for diagnosis negatively impacts patient quality of life and increases costs to the healthcare system. In this work, we employ a multi-relational machine learning approach to predict when a patient with a non-definitive core needle biopsy diagnosis need not undergo an excisional biopsy procedure because the risk of malignancy is low.

7.
BMC Bioinformatics ; 13: 162, 2012 Jul 11.
Article in English | MEDLINE | ID: mdl-22783946

ABSTRACT

BACKGROUND: There is a need for automated methods to learn general features of the interactions of a ligand class with its diverse set of protein receptors. An appropriate machine learning approach is Inductive Logic Programming (ILP), which automatically generates comprehensible rules in addition to prediction. The development of ILP systems which can learn rules of the complexity required for studies on protein structure remains a challenge. In this work we use a new ILP system, ProGolem, and demonstrate its performance on learning features of hexose-protein interactions. RESULTS: The rules induced by ProGolem detect interactions mediated by aromatics and by planar-polar residues, in addition to less common features such as the aromatic sandwich. The rules also reveal a previously unreported dependency for residues cys and leu. They also specify interactions involving aromatic and hydrogen bonding residues. This paper shows that Inductive Logic Programming implemented in ProGolem can derive rules giving structural features of protein/ligand interactions. Several of these rules are consistent with descriptions in the literature. CONCLUSIONS: In addition to confirming literature results, ProGolem's model has a 10-fold cross-validated predictive accuracy that is superior, at the 95% confidence level, to another ILP system previously used to study protein/hexose interactions and is comparable with state-of-the-art statistical learners.


Subject(s)
Artificial Intelligence, Hexoses/chemistry, Protein Binding, Hexoses/metabolism, Ligands, Proteins/chemistry, Proteins/metabolism
8.
J Am Med Inform Assoc ; 19(5): 913-6, 2012.
Article in English | MEDLINE | ID: mdl-22291166

ABSTRACT

Because breast tissue composition partially predicts breast cancer risk, classification of mammography reports by breast tissue composition is important from both a scientific and clinical perspective. A method is presented for using the unstructured text of mammography reports to classify them into BI-RADS breast tissue composition categories. An algorithm that uses regular expressions to automatically determine BI-RADS breast tissue composition classes for unstructured mammography reports was developed. The algorithm assigns each report to a single BI-RADS composition class: 'fatty', 'fibroglandular', 'heterogeneously dense', 'dense', or 'unspecified'. We evaluated its performance on mammography reports from two different institutions. The method achieves >99% classification accuracy on a test set of reports from the Marshfield Clinic (Wisconsin) and Stanford University. Since large-scale studies of breast cancer rely heavily on breast tissue composition information, this method could facilitate this research by helping mine large datasets to correlate breast composition with other covariates.
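
A regular-expression classifier of this kind can be sketched as follows. The five class names come from the abstract; the patterns themselves are hypothetical stand-ins, since the abstract does not list the expressions the authors used. The order of the rules matters: "heterogeneously dense" has to be tested before the plain "dense" rule.

```python
import re

# Hypothetical patterns, checked in order; the first match wins.
COMPOSITION_RULES = [
    ("heterogeneously dense", re.compile(r"heterogeneously\s+dense", re.I)),
    ("dense", re.compile(r"\bdense\b", re.I)),
    ("fibroglandular",
     re.compile(r"scattered\s+(?:areas\s+of\s+)?fibroglandular", re.I)),
    ("fatty", re.compile(r"\bfatty\b|almost\s+entirely\s+fat", re.I)),
]


def classify_composition(report_text):
    """Assign a report to a single tissue composition class.

    Returns 'unspecified' when no rule matches.
    """
    for label, pattern in COMPOSITION_RULES:
        if pattern.search(report_text):
            return label
    return "unspecified"


# Made-up report sentences, for illustration only.
samples = [
    "The breasts are heterogeneously dense, which may obscure small masses.",
    "There are scattered fibroglandular densities.",
    "The breast tissue is extremely dense.",
    "The breasts are almost entirely fatty.",
    "Comparison is made to the prior study.",
]
predicted = [classify_composition(s) for s in samples]
```

Real reports vary much more in wording than these samples, so a pattern set like this would need to be validated against manually labeled reports from each institution before use.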


Subject(s)
Breast/pathology, Data Mining/methods, Mammography/classification, Natural Language Processing, Radiology Information Systems/classification, Algorithms, Female, Humans, Risk Assessment, Sensitivity and Specificity, United States
9.
Article in English | MEDLINE | ID: mdl-23797461

ABSTRACT

In this work we build the first BI-RADS parser for Portuguese free texts, modeled after existing approaches to extract BI-RADS features from English medical records. Our concept finder uses a semantic grammar based on the BI-RADS lexicon and on iterative transferred expert knowledge. We compare the performance of our algorithm to manual annotation by a specialist in mammography. Our results show that our parser's performance is comparable to the manual method.

10.
AMIA Annu Symp Proc ; 2012: 1330-9, 2012.
Article in English | MEDLINE | ID: mdl-23304412

ABSTRACT

Overdiagnosis is a phenomenon in which screening identifies cancer that may not go on to cause symptoms or death. Women over 65 who develop breast cancer bear the heaviest burden of overdiagnosis. This work introduces novel machine learning algorithms to improve diagnostic accuracy of breast cancer in aging populations. At the same time, we aim to minimize unnecessary invasive procedures (thus decreasing false positives) while addressing overdiagnosis. We develop a novel algorithm, Logical Differential Prediction Bayes Net (LDP-BN), that calculates the risk of breast disease based on mammography findings. LDP-BN uses Inductive Logic Programming (ILP) to learn relational rules, selects older-specific differentially predictive rules, and incorporates them into a Bayes Net, significantly improving its performance. In addition, LDP-BN offers valuable insight into the classification process, revealing novel older-specific rules that link mass presence to invasive cancer, and calcification presence and lack of detectable mass to DCIS.


Subject(s)
Algorithms, Artificial Intelligence, Breast Neoplasms/diagnosis, Noninfiltrating Intraductal Carcinoma/diagnosis, Aged, Bayes Theorem, Breast Neoplasms/diagnostic imaging, Noninfiltrating Intraductal Carcinoma/diagnostic imaging, Diagnostic Errors/prevention & control, Female, Humans, Logic, Mammography
11.
Inductive Log Program ; 5989: 149-165, 2010.
Article in English | MEDLINE | ID: mdl-25309972

ABSTRACT

Hexoses are simple sugars that play a key role in many cellular pathways, and in the regulation of development and disease mechanisms. Current protein-sugar computational models are based, at least partially, on prior biochemical findings and knowledge. They incorporate different parts of these findings in predictive black-box models. We investigate the empirical support for biochemical findings by comparing Inductive Logic Programming (ILP) induced rules to actual biochemical results. We mine the Protein Data Bank for a representative data set of hexose binding sites, non-hexose binding sites and surface grooves. We build an ILP model of hexose-binding sites and evaluate our results against several baseline machine learning classifiers. Our method achieves an accuracy similar to that of other black-box classifiers while providing insight into the discriminating process. In addition, it confirms wet-lab findings and reveals a previously unreported Trp-Glu amino acid dependency.

12.
Proteins ; 77(1): 121-32, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19415755

ABSTRACT

Glucose is a simple sugar that plays an essential role in many basic metabolic and signaling pathways. Many proteins have binding sites that are highly specific to glucose. The exponential increase of genomic data has revealed the identity of many proteins that seem to be central to biological processes, but whose exact functions are unknown. Many of these proteins seem to be associated with disease processes. Being able to predict glucose-specific binding sites in these proteins will greatly enhance our ability to annotate protein function and may significantly contribute to drug design. We present the first glucose-binding site classifier algorithm. We consider the sugar-binding pocket as a spherical spatio-chemical environment and represent it as a vector of geometric and chemical features. We then perform Random Forests feature selection to identify key features and analyze them using support vector machine classification. Our work shows that glucose binding sites can be modeled effectively using a limited number of basic chemical and residue features. Using a leave-one-out cross-validation method, our classifier achieves an 8.11% error, an 89.66% sensitivity and a 93.33% specificity over our dataset. From a biochemical perspective, our results support the relevance of ordered water molecules and ions in determining glucose specificity. They also reveal the importance of carboxylate residues in glucose binding and the high concentration of negatively charged atoms in direct contact with the bound glucose molecule.
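
For readers less familiar with these metrics, the sketch below shows how error, sensitivity and specificity follow from a confusion matrix. The abstract does not report the underlying counts; the counts used here (26 true positives, 3 false negatives, 42 true negatives, 3 false positives) are illustrative values chosen only because they reproduce the reported percentages, and should not be read as the paper's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Error rate, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "error": (fn + fp) / total,       # fraction of all cases misclassified
        "sensitivity": tp / (tp + fn),    # fraction of true sites recovered
        "specificity": tn / (tn + fp),    # fraction of non-sites rejected
    }


# Illustrative counts (an assumption, not taken from the paper).
metrics = classification_metrics(tp=26, fn=3, tn=42, fp=3)
as_percent = {name: round(100 * value, 2) for name, value in metrics.items()}
```

Under leave-one-out cross-validation each case is predicted by a model trained on all the other cases, so counts like these would be accumulated one held-out prediction at a time.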


Subject(s)
Computational Biology/methods, Glucose/metabolism, Proteins/chemistry, Proteins/metabolism, Algorithms, Binding Sites, Protein Databases, Hydrogen Bonding, Hydrophobic and Hydrophilic Interactions, Protein Binding, Software
13.
Article in English | MEDLINE | ID: mdl-23765123

ABSTRACT

Breast cancer is the leading cause of cancer mortality in women between the ages of 15 and 54. During mammography screening, radiologists use a strict lexicon (BI-RADS) to describe and report their findings. Mammography records are then stored in a well-defined database format (NMD). Lately, researchers have applied data mining and machine learning techniques to these databases. They successfully built breast cancer classifiers that can help in early detection of malignancy. However, the validity of these models depends on the quality of the underlying databases. Unfortunately, most databases suffer from inconsistencies, missing data, inter-observer variability and inappropriate term usage. In addition, many databases are not compliant with the NMD format and/or solely consist of text reports. BI-RADS feature extraction from free text and consistency checks between recorded predictive variables and text reports are crucial to addressing this problem. We describe a general scheme for concept information retrieval from free text given a lexicon, and present a BI-RADS features extraction algorithm for clinical data mining. It consists of a syntax analyzer, a concept finder and a negation detector. The syntax analyzer preprocesses the input into individual sentences. The concept finder uses a semantic grammar based on the BI-RADS lexicon and the experts' input. It parses sentences detecting BI-RADS concepts. Once a concept is located, a lexical scanner checks for negation. Our method can handle multiple latent concepts within the text, filtering out ultrasound concepts. On our dataset, our algorithm achieves 97.7% precision, 95.5% recall and an F1-score of 0.97. It outperforms manual feature extraction at the 5% statistical significance level.
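
The three stages described above (sentence splitting, concept finding, negation detection) can be sketched in a few lines. This is a deliberately simplified illustration: the two-entry lexicon, the negation cues and the function names are hypothetical, and a regular-expression lookup stands in for the semantic grammar the authors describe. The final helper shows how the reported F1-score follows from the reported precision and recall.

```python
import re

# Hypothetical mini-lexicon; the real BI-RADS lexicon is far larger.
LEXICON = {
    "mass": re.compile(r"\bmass(?:es)?\b", re.I),
    "calcification": re.compile(r"\b(?:micro)?calcifications?\b", re.I),
}
# Hypothetical negation cues, searched in the text preceding a concept.
NEGATION_CUES = re.compile(r"\b(?:no|without|absence of)\b", re.I)


def split_sentences(text):
    """Syntax-analyzer stand-in: split on sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_concepts(text):
    """Return (concept, is_present) pairs, one per concept per sentence."""
    findings = []
    for sentence in split_sentences(text):
        for concept, pattern in LEXICON.items():
            match = pattern.search(sentence)
            if match:
                negated = bool(NEGATION_CUES.search(sentence[:match.start()]))
                findings.append((concept, not negated))
    return findings


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


report = ("There is an irregular mass in the left breast. "
          "No suspicious calcifications are seen.")
findings = extract_concepts(report)
reported_f1 = round(f1_score(0.977, 0.955), 2)
```

Scoping negation to the text before the concept within the same sentence is a crude heuristic; it would miss constructions such as "mass is not seen", which is one reason a dedicated lexical scanner is used in the described system.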
