Results 1 - 19 of 19
1.
J Biomed Inform ; 142: 104386, 2023 06.
Article in English | MEDLINE | ID: mdl-37178780

ABSTRACT

OBJECTIVE: With the onset of the Coronavirus Disease 2019 (COVID-19) pandemic, there has been a surge in the number of publicly available biomedical information sources, which makes it an increasingly challenging research goal to retrieve a relevant text to a topic of interest. In this paper, we propose a Contextual Query Expansion framework based on the clinical Domain knowledge (CQED) for formalizing an effective search over PubMed to retrieve relevant COVID-19 scholarly articles to a given information need. MATERIALS AND METHODS: For the sake of training and evaluation, we use the widely adopted TREC-COVID benchmark. Given a query, the proposed framework utilizes a contextual and a domain-specific neural language model to generate a set of candidate query expansion terms that enrich the original query. Moreover, the framework includes a multi-head attention mechanism that is trained alongside a learning-to-rank model for re-ranking the list of generated expansion candidate terms. The original query and the top-ranked expansion terms are posed to the PubMed search engine for retrieving relevant scholarly articles to an information need. The framework, CQED, can have four different variations, depending upon the learning path adopted for training and re-ranking the candidate expansion terms. RESULTS: The model drastically improves the search performance, when compared to the original query. The performance improvement in comparison to the original query, in terms of RECALL@1000 is 190.85% and in terms of NDCG@1000 is 343.55%. Additionally, the model outperforms all existing state-of-the-art baselines. In terms of P@10, the model that has been optimized based on Precision outperforms all baselines (0.7987). On the other hand, in terms of NDCG@10 (0.7986), MAP (0.3450) and bpref (0.4900), the CQED model that has been optimized based on an average of all retrieval measures outperforms all the baselines. 
CONCLUSION: The proposed model successfully expands queries posed to PubMed, and improves search performance, as compared to all existing baselines. A success/failure analysis shows that the model improved the search performance of each of the evaluated queries. Moreover, an ablation study depicted that if ranking of generated candidate terms is not conducted, the overall performance decreases. For future work, we would like to explore the application of the presented query expansion framework in conducting technology-assisted Systematic Literature Reviews (SLR).
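
The abstract above reports relative gains in RECALL@1000 and NDCG@1000. As a rough illustration of how such measures and a percentage improvement are typically computed, here is a minimal sketch. It is not the authors' evaluation code; the function names and the toy document IDs and relevance grades in the usage note are invented for the example.

```python
import math

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant documents that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)

def dcg_at_k(gains, k):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(g / math.log2(pos + 2) for pos, g in enumerate(gains[:k]))

def ndcg_at_k(ranked_ids, grades, k):
    """DCG of the ranking divided by the DCG of an ideal ordering."""
    gains = [grades.get(doc_id, 0) for doc_id in ranked_ids]
    ideal = sorted(grades.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0

def relative_improvement(baseline, improved):
    """Percentage change of `improved` relative to `baseline`."""
    return 100.0 * (improved - baseline) / baseline
```

For instance, with hypothetical grades `{"d1": 2, "d2": 1, "d3": 0}`, the ranking `["d1", "d2", "d3"]` is already ideal, so `ndcg_at_k` returns 1.0, while the reversed ranking scores lower.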


Subject(s)
COVID-19; Information Storage and Retrieval; Humans; PubMed; Search Engine; Semantics
2.
Knowl Inf Syst ; 65(5): 1989-2016, 2023.
Article in English | MEDLINE | ID: mdl-36643405

ABSTRACT

In the last decade, a large number of knowledge graph (KG) completion approaches were proposed. Albeit effective, these efforts are disjoint, and their collective strengths and weaknesses in effective KG completion have not been studied in the literature. We extend Plumber, a framework that brings together the research community's disjoint efforts on KG completion. We include more components in the architecture of Plumber, which now comprises 40 reusable components for various KG completion subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable knowledge extraction pipelines and offers 432 distinct pipelines overall. We study the optimization problem of choosing optimal pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting the KG triples using standard datasets over three KGs: DBpedia, Wikidata, and Open Research Knowledge Graph. Our results demonstrate the effectiveness of Plumber in dynamically generating KG completion pipelines, outperforming all baselines agnostic of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components and discuss their limitations.
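
One way to picture how a set of reusable components yields many distinct pipelines is to take one component per subtask and enumerate the combinations. The sketch below assumes exactly that composition rule, which the abstract does not spell out; the component names and per-stage counts are hypothetical and are not Plumber's actual registry.

```python
from itertools import product

# Hypothetical component names per subtask (not Plumber's real components).
COMPONENTS = {
    "coreference_resolution": ["coref_a", "coref_b"],
    "entity_linking": ["linker_a", "linker_b", "linker_c"],
    "relation_extraction": ["rel_a", "rel_b"],
}

def enumerate_pipelines(components):
    """Build every pipeline that picks exactly one component per stage."""
    stages = list(components)
    choices = (components[stage] for stage in stages)
    return [dict(zip(stages, combo)) for combo in product(*choices)]

pipelines = enumerate_pipelines(COMPONENTS)
print(len(pipelines))  # 2 * 3 * 2 = 12 combinations for this toy registry
```

A classifier that selects a pipeline per input sentence, as the abstract describes, would then predict an index into a list such as `pipelines`.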

3.
PeerJ Comput Sci ; 8: e1163, 2022.
Article in English | MEDLINE | ID: mdl-36532807

ABSTRACT

With advances in artificial intelligence and semantic technology, search engines are integrating semantics to address complex search queries to improve the results. This requires identification of well-known concepts or entities and their relationship from web page contents. But the increase in complex unstructured data on web pages has made the task of concept identification overly complex. Existing research focuses on entity recognition from the perspective of linguistic structures such as complete sentences and paragraphs, whereas a huge part of the data on web pages exists as unstructured text fragments enclosed in HTML tags. Ontologies provide schemas to structure the data on the web. However, including them in the web pages requires additional resources and expertise from organizations or webmasters and thus becoming a major hindrance in their large-scale adoption. We propose an approach for autonomous identification of entities from short text present in web pages to populate semantic models based on a specific ontology model. The proposed approach has been applied to a public dataset containing academic web pages. We employ a long short-term memory (LSTM) deep learning network and the random forest machine learning algorithm to predict entities. The proposed methodology gives an overall accuracy of 0.94 on the test dataset, indicating a potential for automated prediction even in the case of a limited number of training samples for various entities, thus, significantly reducing the required manual workload in practical applications.

4.
JMIR Med Inform ; 10(10): e39616, 2022 10 26.
Article in English | MEDLINE | ID: mdl-36287591

ABSTRACT

BACKGROUND: Information retrieval (IR) from the free text within electronic health records (EHRs) is time consuming and complex. We hypothesize that natural language processing (NLP)-enhanced search functionality for EHRs can make clinical workflows more efficient and reduce cognitive load for clinicians. OBJECTIVE: This study aimed to evaluate the efficacy of 3 levels of search functionality (no search, string search, and NLP-enhanced search) in supporting IR for clinical users from the free text of EHR documents in a simulated clinical environment. METHODS: A clinical environment was simulated by uploading 3 sets of patient notes into an EHR research software application and presenting these alongside 3 corresponding IR tasks. Tasks contained a mixture of multiple-choice and free-text questions. A prospective crossover study design was used, for which 3 groups of evaluators were recruited, which comprised doctors (n=19) and medical students (n=16). Evaluators performed the 3 tasks using each of the search functionalities in an order in accordance with their randomly assigned group. The speed and accuracy of task completion were measured and analyzed, and user perceptions of NLP-enhanced search were reviewed in a feedback survey. RESULTS: NLP-enhanced search facilitated more accurate task completion than both string search (5.14%; P=.02) and no search (5.13%; P=.08). NLP-enhanced search and string search facilitated similar task speeds, both showing an increase in speed compared to the no search function, by 11.5% (P=.008) and 16.0% (P=.007) respectively. Overall, 93% of evaluators agreed that NLP-enhanced search would make clinical workflows more efficient than string search, with qualitative feedback reporting that NLP-enhanced search reduced cognitive load. 
CONCLUSIONS: To the best of our knowledge, this study is the largest evaluation to date of different search functionalities for supporting target clinical users in realistic clinical workflows, with a 3-way prospective crossover study design. NLP-enhanced search improved both accuracy and speed of clinical EHR IR tasks compared to browsing clinical notes without search. NLP-enhanced search improved accuracy and reduced the number of searches required for clinical EHR IR tasks compared to direct search term matching.

5.
JMIR Med Inform ; 10(5): e37215, 2022 May 13.
Article in English | MEDLINE | ID: mdl-35476822

ABSTRACT

BACKGROUND: With the continuous spread of COVID-19, information about the worldwide pandemic is exploding. Therefore, it is necessary and significant to organize such a large amount of information. As the key branch of artificial intelligence, a knowledge graph (KG) is helpful to structure, reason, and understand data. OBJECTIVE: To improve the utilization value of the information and effectively aid researchers to combat COVID-19, we have constructed and successively released a unified linked data set named OpenKG-COVID19, which is one of the largest existing KGs related to COVID-19. OpenKG-COVID19 includes 10 interlinked COVID-19 subgraphs covering the topics of encyclopedia, concept, medical, research, event, health, epidemiology, goods, prevention, and character. METHODS: In this paper, we introduce the key techniques exploited in building COVID-19 KGs in a top-down manner. First, the schema of the modeling process for each KG in OpenKG-COVID19 is described. Second, we propose different methods for extracting knowledge from open government sites, professional texts, public domain-specific sources, and public encyclopedia sites. The curated 10 COVID-19 KGs are further linked together at both the schema and data levels. In addition, we present the naming convention for OpenKG-COVID19. RESULTS: OpenKG-COVID19 has more than 2572 concepts, 329,600 entities, 513 properties, and 2,687,329 facts, and the data set will be updated continuously. Each COVID-19 KG was evaluated, and the average precision was found to be above 93%. We have developed search and browse interfaces and a SPARQL endpoint to improve user access. Possible intelligent applications based on OpenKG-COVID19 for further development are also described. CONCLUSIONS: A KG is useful for intelligent question-answering, semantic searches, recommendation systems, visualization analysis, and decision-making support. 
Research related to COVID-19, biomedicine, and many other communities can benefit from OpenKG-COVID19. Furthermore, the 10 KGs will be continuously updated to ensure that the public will have access to sufficient and up-to-date knowledge.

6.
PNAS Nexus ; 1(5): pgac273, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36712330

ABSTRACT

Creative problem solving is a fundamental skill of human cognition and is conceived as a search process whereby a novel and appropriate solution is generated. However, it is unclear whether children are able to balance novelty and appropriateness to generate creative solutions and what are the underlying computational mechanisms. Here, we asked children, ranging from 10 to 11 years old, to perform a word association task according to three instructions, which triggered a more appropriate (ordinary), novel (random), or balanced (creative) response. Results revealed that children exhibited greater cognitive flexibility in the creative condition compared to the control conditions, as revealed by the structure and resiliency of the semantic networks. Moreover, responses' word embeddings extracted from pretrained deep neural networks showed that semantic distance and category switching index increased in the creative condition with respect to the ordinary condition and decreased compared to the random condition. Critically, we showed how children efficiently solved the exploration/exploitation trade-off to generate creative associations by fitting a computational reinforcement learning (RL) model that simulates semantic search strategies. Our findings provide compelling evidence that children balance novelty and appropriateness to generate creative associations by optimally regulating the level of exploration in the semantic search. This corroborates previous findings on the adult population and highlights the crucial contribution of both components to the overall creative process. In conclusion, these results shed light on the connections between theoretical concepts such as bottom-up/top-down modes of thinking in creativity research and the exploration/exploitation trade-off in human RL research.

7.
Front Hum Neurosci ; 13: 341, 2019.
Article in English | MEDLINE | ID: mdl-31680903

ABSTRACT

A recent neuropsychological study found that amnesic patients with hippocampal damage (HP) and severe declarative memory impairment produce markedly fewer responses than healthy comparison (CO) participants in a semantic feature generation task (Klooster and Duff, 2015), consistent with the idea that hippocampal damage is associated with semantic cognitive deficits. Participants were presented with a target word and asked to produce as many features of that word as possible (e.g., for target word "book," "read words on a page"). Here, we use the response sequences collected by Klooster and Duff (2015) to develop a vector space model of semantic search. We use this model to characterize the dynamics of semantic feature generation and consider the role of the hippocampus in this search process. Both HP and CO groups tended to initiate the search process with features close in semantic space to the target word, with a gradual decline in similarity to the target word over the first several responses. Adjacent features in the response sequence showed stronger similarity to each other than to non-adjacent features, suggesting that the search process follows a local trajectory in semantic space. Overall, HP patients generated features that were closer in semantic space to the representation of the target word, as compared to the features generated by the CO group, which ranged more widely in semantic space. These results are consistent with a model in which a compound retrieval cue (containing a representation of the target word and a representation of the previous response) is used to probe semantic memory. The model suggests that the HP group's search process is restricted from ranging as far in semantic space from the target word, relative to the CO group. These results place strong constraints on the structure of models of semantic memory search, and on the role of hippocampus in probing semantic memory.
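
The two similarity patterns described above (similarity of each response to the target word, and similarity between adjacent responses) can be illustrated with cosine similarity over word vectors. This is a minimal sketch with made-up two-dimensional vectors; it is not the authors' model, and `similarity_profile` is a name chosen for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def similarity_profile(target, responses):
    """Return (similarity of each response to the target,
    similarity of each response to the one that follows it)."""
    to_target = [cosine(target, r) for r in responses]
    adjacent = [cosine(a, b) for a, b in zip(responses, responses[1:])]
    return to_target, adjacent
```

A declining `to_target` list would correspond to the gradual drift away from the target word, and high `adjacent` values to a local trajectory through semantic space.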

8.
Expert Opin Drug Discov ; 14(5): 433-444, 2019 05.
Article in English | MEDLINE | ID: mdl-30884989

ABSTRACT

INTRODUCTION: The use of semantic web technologies to aid drug discovery has gained momentum over recent years. Researchers in this domain have realized that semantic web technologies are key to dealing with the high levels of data for drug discovery. These technologies enable us to represent the data in a formal, structured, interoperable and comparable way, and to tease out undiscovered links between drug data (be it identifying new drug-targets or relevant compounds, or links between specific drugs and diseases). Areas covered: This review focuses on explaining how semantic web technologies are being used to aid advances in drug discovery. The main types of semantic web technologies are explained, outlining how they work and how they can be used in the drug discovery process, with a consideration of how the use of these technologies has progressed from their initial usage. Expert opinion: The increased availability of shared semantic resources (tools, data and importantly the communities) have enabled the application of semantic web technologies to facilitate semantic (context dependent) search across multiple data sources, which can be used by machine learning to produce better predictions by exploiting the semantic links in knowledge graphs and linked datasets.


Subject(s)
Drug Discovery/methods; Semantic Web; Datasets as Topic; Humans; Machine Learning
9.
BMC Med Inform Decis Mak ; 18(Suppl 2): 57, 2018 07 23.
Article in English | MEDLINE | ID: mdl-30066657

ABSTRACT

BACKGROUND: Acute lymphoblastic leukemia is the most prevalent neoplasia among children. Despite the tremendous achievements of state-of-the-art treatment strategies, drug resistance is still a major cause of chemotherapy failure leading to relapse in pediatric acute lymphoblastic leukemia. The underlying mechanisms of such phenomenon are not yet clear and subject to further exploration. Prior research has shown that microRNAs can act as post-transcriptional regulators of many genes related to drug resistance. However, details of microRNA regulation mechanisms in pediatric acute lymphoblastic leukemia are far from completely understood. METHODS: We utilized a computational approach based upon emerging biomedical and biological ontologies and semantic technologies to investigate the important roles of microRNA: mRNA regulation on glucocorticoid resistance in pediatric acute lymphoblastic leukemia. In particular, various filtering mechanisms were designed based on the user-provided MeSH term to narrow down the most promising microRNAs in an effective manner. RESULTS: During our manual search on background literature, we found a total of 18 candidate microRNAs that possibly regulate glucocorticoid resistance in pediatric acute lymphoblastic leukemia. After the first-round filtering using the Broader-Match option where both the user-provided MeSH term and its direct parent term were utilized, the number of targets for 18 microRNAs was reduced from 232 to 74. During the second-round filtering with the Exact-Match option where only the MeSH term itself was utilized, the number of targets was further reduced to 19. Finally, we conducted semantic searches in the OmniSearch software tool on the five likely regulating microRNAs and identified two most likely microRNAs. 
CONCLUSIONS: We successfully identified two microRNAs, hsa-miR-142-3p and hsa-miR-17-5p, which are computationally predicted to closely relate to glucocorticoid resistance, thus potentially serving as novel biomarkers and therapeutic targets in pediatric acute lymphoblastic leukemia.
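
The two filtering rounds described above differ only in which MeSH terms count as a match: the Broader-Match option accepts the user-provided term or its direct parent, while the Exact-Match option accepts the term alone. A minimal sketch of that rule follows, under the assumption that each target carries a list of MeSH annotations; the data layout, function name, and all term and target names in the usage note are hypothetical and do not reflect the OmniSearch implementation.

```python
def filter_targets(targets, mesh_term, parent_of, mode="exact"):
    """Keep targets annotated with an accepted MeSH term.

    targets:   dict mapping target name -> list of MeSH annotations
    parent_of: dict mapping a MeSH term -> its direct parent term
    mode:      "exact" accepts only mesh_term;
               "broader" also accepts its direct parent
    """
    accepted = {mesh_term}
    if mode == "broader":
        parent = parent_of.get(mesh_term)
        if parent is not None:
            accepted.add(parent)
    elif mode != "exact":
        raise ValueError(f"unknown mode: {mode!r}")
    return [name for name, terms in targets.items() if accepted & set(terms)]
```

With placeholder data such as `{"GENE_A": ["Term X"], "GENE_B": ["Parent of X"], "GENE_C": ["Other"]}` and `parent_of = {"Term X": "Parent of X"}`, the broader mode keeps `GENE_A` and `GENE_B`, and the exact mode keeps only `GENE_A`, mirroring the narrowing from the first round to the second.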


Subject(s)
Glucocorticoids/administration & dosage; MicroRNAs/drug effects; Precursor Cell Lymphoblastic Leukemia-Lymphoma/drug therapy; Semantics; Child; Child, Preschool; Humans; Information Storage and Retrieval; Male; Metabolism, Inborn Errors; Receptors, Glucocorticoid/deficiency
10.
Methods ; 145: 60-66, 2018 08 01.
Article in English | MEDLINE | ID: mdl-29702223

ABSTRACT

Prior research has indicated that, as an important biomarker of chronic low-grade inflammation, high-sensitivity C-reactive protein (hs-CRP) can play important roles in the onset of metabolic syndrome and cardiovascular diseases (CVD). We conducted an integrative approach, which combines biological wet-lab experiments, statistical analysis, and semantics-oriented bioinformatics & computational analysis, to investigate the association among hs-CRP, body fat mass (FM) distribution, and other cardiometabolic risk factors in young healthy women. Research outcomes in this study resulted in two novel discoveries. Discovery 1: There are four primary determinants for hs-CRP, i.e., central/abdominal FM (a.k.a. trunk FM) accumulation, leptin, high density lipoprotein cholesterol (HDL-C), and plasminogen activator inhibitor-1 (PAI-1). Discovery 2: Chronic inflammation may be involved in adipocyte-cytokine interaction underlying the metabolic derangement in healthy young women.


Subject(s)
Body Fat Distribution; C-Reactive Protein/analysis; Cardiovascular Diseases/etiology; Computational Biology; Adolescent; Cardiovascular Diseases/blood; Cardiovascular Diseases/epidemiology; Cholesterol, HDL/blood; Female; Humans; Leptin/blood; Models, Biological; Plasminogen Activator Inhibitor 1/blood; Risk Factors; Young Adult
11.
Front Psychol ; 8: 99, 2017.
Article in English | MEDLINE | ID: mdl-28210234

ABSTRACT

Generating associations is important for cognitive tasks including language acquisition and creative problem solving. It remains an open question how the brain represents and processes associations. The Remote Associates Test (RAT) is a task, originally used in creativity research, that is heavily dependent on generating associations in a search for the solutions to individual RAT problems. In this work we present a model that solves the test. Compared to earlier modeling work on the RAT, our hybrid (i.e., non-developmental) model is implemented in a spiking neural network by means of the Neural Engineering Framework (NEF), demonstrating that it is possible for spiking neurons to be organized to store the employed representations and to manipulate them. In particular, the model shows that distributed representations can support sophisticated linguistic processing. The model was validated on human behavioral data including the typical length of response sequences and similarity relationships in produced responses. These data suggest two cognitive processes that are involved in solving the RAT: one process generates potential responses and a second process filters the responses.

12.
J Biomed Semantics ; 7: 25, 2016.
Article in English | MEDLINE | ID: mdl-27175225

ABSTRACT

As a special class of non-coding RNAs (ncRNAs), microRNAs (miRNAs) perform important roles in numerous biological and pathological processes. The realization of miRNA functions depends largely on how miRNAs regulate specific target genes. It is therefore critical to identify, analyze, and cross-reference miRNA-target interactions to better explore and delineate miRNA functions. Semantic technologies can help in this regard. We previously developed a miRNA domain-specific application ontology, Ontology for MIcroRNA Target (OMIT), whose goal was to serve as a foundation for semantic annotation, data integration, and semantic search in the miRNA field. In this paper we describe our continuing effort to develop the OMIT, and demonstrate its use within a semantic search system, OmniSearch, designed to facilitate knowledge capture of miRNA-target interaction data. Important changes in the current version OMIT are summarized as: (1) following a modularized ontology design (with 2559 terms imported from the NCRO ontology); (2) encoding all 1884 human miRNAs (vs. 300 in previous versions); and (3) setting up a GitHub project site along with an issue tracker for more effective community collaboration on the ontology development. The OMIT ontology is free and open to all users, accessible at: http://purl.obolibrary.org/obo/omit.owl. The OmniSearch system is also free and open to all users, accessible at: http://omnisearch.soc.southalabama.edu/index.php/Software.


Subject(s)
Computational Biology/methods; Epistasis, Genetic/genetics; Gene Ontology; MicroRNAs/genetics; Semantics; User-Computer Interface
13.
Brief Funct Genomics ; 14(3): 213-30, 2015 May.
Article in English | MEDLINE | ID: mdl-24907365

ABSTRACT

The assessment of genome function requires a mapping between genome-derived entities and biochemical reactions, and the biomedical literature represents a rich source of information about reactions between biological components. However, the increasingly rapid growth in the volume of literature provides both a challenge and an opportunity for researchers to isolate information about reactions of interest in a timely and efficient manner. In response, recent text mining research in the biology domain has been largely focused on the identification and extraction of 'events', i.e. categorised, structured representations of relationships between biochemical entities, from the literature. Functional genomics analyses necessarily encompass events as so defined. Automatic event extraction systems facilitate the development of sophisticated semantic search applications, allowing researchers to formulate structured queries over extracted events, so as to specify the exact types of reactions to be retrieved. This article provides an overview of recent research into event extraction. We cover annotated corpora on which systems are trained, systems that achieve state-of-the-art performance and details of the community shared tasks that have been instrumental in increasing the quality, coverage and scalability of recent systems. Finally, several concrete applications of event extraction are covered, together with emerging directions of research.


Subject(s)
Biomedical Research; Computational Biology/methods; Data Mining/methods; Genomics; Humans; Phenotype; Semantics
14.
Int J Antimicrob Agents ; 44(5): 424-30, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25216545

ABSTRACT

Monitoring the rapid global spread of antimicrobial resistance requires an over-regional and fast surveillance tool. Data from major surveillance studies based on aggregated results of selected sentinel laboratories or retrospective strain collections are not available for the whole scientific community and are limited by time and region. Thus, we tested an alternative approach to monitor resistance trends by automated semantic and scientometric analysis of all (>100000) related PubMed entries. A semantic search was done using 'Gene Ontology' and MeSH vocabulary and additional search terms for further data refinement. Data extraction was performed using the semantic search engine 'GoPubMed'. The temporal relationship between introduction of novel β-lactam antibiotic classes into the market and emergence of respective resistance was investigated using nearly 22300 publications over the last 70 years. Further analysis was done with around 54000 publications related to 'infectious diseases' and an additional 50000 publications related to 'antimicrobial resistance' to estimate current trends in publication interest regarding resistance development since 1940. Scientometric results were compared with data from the major surveillance network EARS-Net. Furthermore, the relationship between micro-organism, year and antibiotic market introduction was investigated for eight key antibiotics using nearly 37500 publications. Owing to influencing factors such as availability of alternative antibiotics, scientometric analysis correlated only partly with resistance development. However, it provides a fast, reliable and global overview of the clinical and public health importance of a specific resistance including the period of the 1940s-1980s, when resistance surveillance studies were not yet established.


Subject(s)
Anti-Bacterial Agents/pharmacology; Bacteria/drug effects; Bacterial Infections/epidemiology; Bacterial Infections/microbiology; Bibliometrics; Drug Resistance, Bacterial; Epidemiological Monitoring; Humans; Spatio-Temporal Analysis; Time Factors
15.
Toxicol In Vitro ; 28(4): 571-87, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24389116

ABSTRACT

The knowledge-based search engine Go3R, www.Go3R.org, has been developed to assist scientists from industry and regulatory authorities in collecting comprehensive toxicological information with a special focus on identifying available alternatives to animal testing. The semantic search paradigm of Go3R makes use of expert knowledge on 3Rs methods and regulatory toxicology, laid down in the ontology, a network of concepts, terms, and synonyms, to recognize the contents of documents. Search results are automatically sorted into a dynamic table of contents presented alongside the list of documents retrieved. This table of contents allows the user to quickly filter the set of documents by topics of interest. Documents containing hazard information are automatically assigned to a user interface following the endpoint-specific IUCLID5 categorization scheme required, e.g. for REACH registration dossiers. For this purpose, complex endpoint-specific search queries were compiled and integrated into the search engine (based upon a gold standard of 310 references that had been assigned manually to the different endpoint categories). Go3R sorts 87% of the references concordantly into the respective IUCLID5 categories. Currently, Go3R searches in the 22 million documents available in the PubMed and TOXNET databases. However, it can be customized to search in other databases including in-house databanks.


Subject(s)
Animal Testing Alternatives/methods; Databases, Factual; Hazardous Substances/toxicity; Search Engine; Terminology as Topic; Animal Welfare; Animals; Biomedical Research/methods; Documentation; Research Design
16.
Web Semant ; 29: 39-52, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25814917

ABSTRACT

While contemporary semantic search systems offer to improve classical keyword-based search, they are not always adequate for complex domain specific information needs. The domain of prescription drug abuse, for example, requires knowledge of both ontological concepts and "intelligible constructs" not typically modeled in ontologies. These intelligible constructs convey essential information that include notions of intensity, frequency, interval, dosage and sentiments, which could be important to the holistic needs of the information seeker. In this paper, we present a hybrid approach to domain specific information retrieval that integrates ontology-driven query interpretation with synonym-based query expansion and domain specific rules, to facilitate search in social media on prescription drug abuse. Our framework is based on a context-free grammar (CFG) that defines the query language of constructs interpretable by the search system. The grammar provides two levels of semantic interpretation: 1) a top-level CFG that facilitates retrieval of diverse textual patterns, which belong to broad templates and 2) a low-level CFG that enables interpretation of specific expressions belonging to such textual patterns. These low-level expressions occur as concepts from four different categories of data: 1) ontological concepts, 2) concepts in lexicons (such as emotions and sentiments), 3) concepts in lexicons with only partial ontology representation, called lexico-ontology concepts (such as side effects and routes of administration (ROA)), and 4) domain specific expressions (such as date, time, interval, frequency and dosage) derived solely through rules. Our approach is embodied in a novel Semantic Web platform called PREDOSE, which provides search support for complex domain specific information needs in prescription drug abuse epidemiology. 
When applied to a corpus of over 1 million drug abuse-related web forum posts, our search framework proved effective in retrieving relevant documents when compared with three existing search systems.

17.
Earth Sci Inform ; 6(3)2013 Sep.
Article in English | MEDLINE | ID: mdl-24416086

ABSTRACT

Linked Science is the practice of inter-connecting scientific assets by publishing, sharing and linking scientific data and processes in end-to-end loosely coupled workflows that allow the sharing and re-use of scientific data. Much of this data does not live in the cloud or on the Web, but rather in multi-institutional data centers that provide tools and add value through quality assurance, validation, curation, dissemination, and analysis of the data. In this paper, we make the case for the use of scientific scenarios in Linked Science. We propose a scenario in river-channel transport that requires biogeochemical experimental data and global climate-simulation model data from many sources. We focus on the use of ontologies (formal machine-readable descriptions of the domain) to facilitate search and discovery of this data. Mercury, developed at Oak Ridge National Laboratory, is a tool for distributed metadata harvesting, search and retrieval. Mercury currently provides uniform access to more than 100,000 metadata records; 30,000 scientists use it each month. We augmented search in Mercury with ontologies, such as the ontologies in the Semantic Web for Earth and Environmental Terminology (SWEET) collection, by prototyping a component that provides access to the ontology terms from Mercury. We evaluate the coverage of SWEET for the ORNL Distributed Active Archive Center (ORNL DAAC).

18.
Australas Med J ; 5(9): 482-8, 2012.
Article in English | MEDLINE | ID: mdl-23115582

ABSTRACT

BACKGROUND: This paper presents a novel approach to searching electronic medical records that is based on concept matching rather than keyword matching. AIM: The concept-based approach is intended to overcome specific challenges we identified in searching medical records. METHOD: Queries and documents were transformed from their term-based originals into medical concepts as defined by the SNOMED-CT ontology. RESULTS: Evaluation on a real-world collection of medical records showed our concept-based approach outperformed a keyword baseline by 25% in Mean Average Precision. CONCLUSION: The concept-based approach provides a framework for further development of inference based search systems for dealing with medical data.
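
The difference between keyword matching and concept matching can be shown with a toy example: two phrases that share no words can still map to the same concept. The sketch below is only an illustration of the general idea. The term-to-concept table uses placeholder identifiers rather than real SNOMED-CT codes, the function names are invented, and `average_precision` is the standard per-query measure that Mean Average Precision averages over queries.

```python
# Hypothetical term-to-concept table; "C001"/"C002" are placeholders,
# not SNOMED-CT identifiers.
TERM_TO_CONCEPT = {
    "heart attack": "C001",
    "myocardial infarction": "C001",
    "high blood pressure": "C002",
    "hypertension": "C002",
}

def to_concepts(text):
    """Map free text to the set of concepts whose terms occur in it."""
    lowered = text.lower()
    return {cid for term, cid in TERM_TO_CONCEPT.items() if term in lowered}

def concept_match(query, document):
    """True if the query and document share at least one concept."""
    return bool(to_concepts(query) & to_concepts(document))

def keyword_match(query, document):
    """True only if the query string literally occurs in the document."""
    return query.lower() in document.lower()

def average_precision(ranked_relevance):
    """Average precision for one query, given relevance flags in rank order.

    Assumes every relevant document appears somewhere in the ranking.
    """
    hits, total = 0, 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0
```

Here the query "heart attack" fails `keyword_match` against a note that says "myocardial infarction" but passes `concept_match`, which is the kind of vocabulary mismatch the concept-based approach targets.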

19.
Front Genet ; 3: 111, 2012.
Article in English | MEDLINE | ID: mdl-22737162

ABSTRACT

An initiative of the NIH Blueprint for Neuroscience Research, the Neuroscience Information Framework (NIF) project advances neuroscience by enabling discovery and access to public research data and tools worldwide through an open source, semantically enhanced search portal. One of the critical components for the overall NIF system, the NIF Standardized Ontologies (NIFSTD), provides an extensive collection of standard neuroscience concepts along with their synonyms and relationships. The knowledge models defined in the NIFSTD ontologies enable an effective concept-based search over heterogeneous types of web-accessible information entities in NIF's production system. NIFSTD covers major domains in neuroscience, including diseases, brain anatomy, cell types, sub-cellular anatomy, small molecules, techniques, and resource descriptors. Since the first production release in 2008, NIF has grown significantly in content and functionality, particularly with respect to the ontologies and ontology-based services that drive the NIF system. We report here on the structure, design principles, community engagement, and the current state of NIFSTD ontologies.
