1.
Intensive Care Med Exp ; 12(1): 71, 2024 Aug 16.
Article in English | MEDLINE | ID: mdl-39147878

ABSTRACT

BACKGROUND: Artificial intelligence, through improved data management and automated summarisation, has the potential to enhance intensive care unit (ICU) care. Large language models (LLMs) can interrogate and summarise large volumes of medical notes to create succinct discharge summaries. In this study, we aim to investigate the potential of LLMs to accurately and concisely synthesise ICU discharge summaries. METHODS: Anonymised clinical notes from ICU admissions were used to train and validate a prompting structure in three separate LLMs (ChatGPT, GPT-4 API and Llama 2) to generate concise clinical summaries. Summaries were adjudicated by staff intensivists on ability to identify and appropriately order a pre-defined list of important clinical events as well as readability, organisation, succinctness, and overall rank. RESULTS: In the development phase, text from five ICU episodes was used to develop a series of prompts to best capture clinical summaries. In the testing phase, a summary produced by each LLM from an additional six ICU episodes was utilised for evaluation. Overall ability to identify a pre-defined list of important clinical events in the summary was 41.5 ± 15.2% for GPT-4 API, 19.2 ± 20.9% for ChatGPT and 16.5 ± 14.1% for Llama 2 (p = 0.002). GPT-4 API, followed by ChatGPT, scored highest for appropriately ordering the pre-defined list of important clinical events in the summary, as well as for readability, organisation, succinctness, and overall rank, whilst Llama 2 scored lowest on all measures. GPT-4 API produced minor hallucinations, which were not present in the other models. CONCLUSION: The LLMs differed in readability, organisation, succinctness, and sequencing of clinical events. All models had problems with narrative coherence, omitted key clinical data, and only moderately captured clinically meaningful data in the correct order. Nevertheless, these technologies show potential for creating succinct discharge summaries.

2.
BMC Bioinformatics ; 25(1): 62, 2024 Feb 07.
Article in English | MEDLINE | ID: mdl-38326757

ABSTRACT

BACKGROUND: Recent developments in the domain of biomedical knowledge bases (KBs) open up new ways to exploit biomedical knowledge that is available in the form of KBs. Significant work has been done in the direction of biomedical KB creation and KB completion, specifically for KBs containing gene-disease associations and other related entities. However, the use of such biomedical KBs in combination with patients' temporal clinical data remains largely unexplored, although it has the potential to greatly benefit medical diagnostic decision support systems. RESULTS: We propose two new algorithms, LOADDx and SCADDx, to combine a patient's gene expression data with gene-disease association and other related information available in the form of a KB, to assist personalized disease diagnosis. We have tested both algorithms on two KBs and on four real-world gene expression datasets of respiratory viral infection caused by Influenza-like viruses of 19 subtypes. We also compare the performance of the proposed algorithms with that of five existing state-of-the-art machine learning algorithms (k-NN, Random Forest, XGBoost, Linear SVM, and SVM with RBF Kernel) using two validation approaches: LOOCV and a single internal validation set. Both SCADDx and LOADDx outperform the existing algorithms when evaluated with both validation approaches. SCADDx is able to detect infections with up to 100% accuracy in the cases of Datasets 2 and 3. Overall, SCADDx and LOADDx are able to detect an infection within 72 h of infection with 91.38% and 92.66% average accuracy, respectively, considering all four datasets, whereas XGBoost, which performed best among the existing machine learning algorithms, can detect the infection with only 86.43% accuracy on average.
CONCLUSIONS: We demonstrate how our novel idea of using the most and least differentially expressed genes in combination with a KB can enable identification of the diseases that a patient is most likely to have at a particular time, from a KB with thousands of diseases. Moreover, the proposed algorithms can provide a short ranked list of the most likely diseases for each patient along with their most affected genes, and other entities linked with them in the KB, which can support health care professionals in their decision-making.
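
The LOOCV validation approach named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the 1-nearest-neighbour classifier, the function names `one_nn_predict` and `loocv_accuracy`, and the toy feature vectors are all assumptions made here for illustration only.

```python
from math import dist


def one_nn_predict(train, query):
    """Return the label of the training sample closest to `query` (Euclidean distance)."""
    nearest = min(train, key=lambda item: dist(item[0], query))
    return nearest[1]


def loocv_accuracy(samples):
    """Hold out each sample once, predict it from the rest, and return the fraction correct."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]  # every sample except the held-out one
        if one_nn_predict(train, features) == label:
            correct += 1
    return correct / len(samples)


# Toy, made-up two-feature vectors (hypothetical; not data from the study).
toy = [
    ((0.10, 0.20), "healthy"),
    ((0.20, 0.10), "healthy"),
    ((0.15, 0.25), "healthy"),
    ((0.90, 0.80), "infected"),
    ((0.80, 0.90), "infected"),
    ((0.85, 0.75), "infected"),
]

print(loocv_accuracy(toy))
```

Because every sample serves as the test case exactly once, LOOCV uses all available data for evaluation, which suits small cohorts such as gene expression studies.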


Subject(s)
Knowledge Bases , Transcriptome , Humans , Algorithms , Machine Learning
4.
IEEE/ACM Trans Comput Biol Bioinform ; 19(5): 2794-2805, 2022.
Article in English | MEDLINE | ID: mdl-34181549

ABSTRACT

One of the key challenges in systems biology is to derive gene regulatory networks (GRNs) from complex, high-dimensional, sparse data. Bayesian networks (BNs) and dynamic Bayesian networks (DBNs) have been widely applied to infer GRNs from gene expression data. GRNs are typically sparse, but traditional approaches to BN structure learning for elucidating GRNs often produce many spurious (false positive) edges. We present two new BN scoring functions, which extend the Bayesian Information Criterion (BIC) score with additional penalty terms, and use them in conjunction with DBN structure search methods to find a graph structure that maximises the proposed scores. Our BN scoring functions offer better solutions for inferring networks with fewer spurious edges compared to the BIC score. The proposed methods are evaluated extensively on autoregressive and DREAM4 benchmarks. We found that they significantly improve the precision of the learned graphs, relative to the BIC score. The proposed methods are also evaluated on three real time series gene expression datasets. The results demonstrate that our algorithms are able to learn sparse graphs from high-dimensional time series data. The implementation of these algorithms is open source and is available in the form of an R package on GitHub at https://github.com/HamdaBinteAjmal/DBN4GRN, along with the documentation and tutorials.
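
For orientation, the baseline BIC score that the abstract extends can be written in its usual decomposable form. This is the standard textbook definition, not the paper's proposed scores; the additional penalty terms are not given in the abstract and are not reproduced here. The symbols are notation assumed for this sketch: $p$ nodes, $n$ samples, $\hat{L}_i$ the maximised local likelihood of node $X_i$ given its parents $\mathrm{Pa}_G(X_i)$ in graph $G$, and $k_i$ the number of free parameters in that local model.

```latex
\mathrm{BIC}(G \mid D)
  \;=\; \sum_{i=1}^{p} \Bigl[ \log \hat{L}_i\bigl(X_i \mid \mathrm{Pa}_G(X_i)\bigr)
  \;-\; \frac{k_i}{2}\,\log n \Bigr]
```

In this form a higher score is better: each extra parent raises the likelihood term but also raises $k_i$, so an edge is kept only if its gain in fit outweighs the $\tfrac{1}{2}\log n$ cost per added parameter.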


Subject(s)
Computational Biology , Gene Expression Profiling , Algorithms , Bayes Theorem , Computational Biology/methods , Gene Expression/genetics , Gene Regulatory Networks/genetics , Time Factors
6.
J Chem Inf Model ; 60(4): 1936-1954, 2020 Apr 27.
Article in English | MEDLINE | ID: mdl-32142271

ABSTRACT

This paper presents a new approach to classification of high-dimensional spectroscopy data and demonstrates that it outperforms other current state-of-the-art approaches. The specific task we consider is identifying whether or not samples contain chlorinated solvents, based on their Raman spectra. We also examine robustness to outlier samples that are not represented in the training set (negative outliers). A novel application of a locally connected neural network (NN) for the binary classification of spectroscopy data is proposed and demonstrated to yield improved accuracy over traditionally popular algorithms. Additionally, we show that the accuracy of the locally connected NN algorithm can be further increased through the use of synthetic training spectra, and we investigate the use of autoencoder-based one-class classifiers and outlier detectors. Finally, a two-step classification process is presented as an alternative to the binary and one-class classification paradigms. This process combines the locally connected NN classifier, the use of synthetic training data, and an autoencoder-based outlier detector to produce a model which is shown both to achieve high classification accuracy and to be robust in the presence of negative outliers.


Subject(s)
Deep Learning , Algorithms , Neural Networks, Computer , Spectrum Analysis
7.
J Chem Inf Model ; 55(5): 963-71, 2015 May 26.
Article in English | MEDLINE | ID: mdl-25902003

ABSTRACT

Similarity plays a central role in spectral library search. The goal of spectral library search is to identify those spectra in a reference library of known materials that most closely match an unknown query spectrum, on the assumption that this will allow us to identify the main constituent(s) of the query spectrum. The similarity measures used for this task in software and the academic literature are almost exclusively metrics, meaning that the measures obey the three axioms of metrics: (1) minimality; (2) symmetry; (3) triangle inequality. Consequently, they implicitly assume that the query spectrum is drawn from the same distribution as that of the reference library. In this paper, we demonstrate that this assumption is not necessary in practical spectral library search and that in fact it is often violated in practice. Although the reference library may be constructed carefully, it is generally impossible to guarantee that all future query spectra will be drawn from the same distribution as the reference library. Before evaluating different similarity measures, we need to understand how they define the relationship between spectra. In spectral library search, we often aim to find the constituent(s) of a mixture. We propose that, rather than asking which reference library spectra are similar to the mixture, we should ask which of the reference library spectra are contained in the given query mixture. This question is inherently asymmetric. Therefore, we should adopt a nonmetric measure. To evaluate our hypothesis, we apply a nonmetric measure formulated by Tversky [Psychol. Rev. 1977, 84, 327-352] known as the Contrast Model and compare its performance to the well-known Jaccard similarity index metric on spectroscopic data sets. Our results show that the Tversky similarity measure yields better results than the Jaccard index.
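
The asymmetry argued for above can be illustrated with a minimal sketch. It assumes each spectrum is reduced to a set of peak positions, which is an assumption made here for illustration; the abstract does not specify the feature representation. The function names, the weights `theta`, `alpha`, `beta`, and the toy peak sets are hypothetical. The contrast model follows Tversky's general form, theta*f(A and B) - alpha*f(A not B) - beta*f(B not A), with f taken as set size.

```python
def jaccard(a: set, b: set) -> float:
    """Symmetric Jaccard index: |A & B| / |A | B|."""
    union = a | b
    if not union:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(union)


def tversky_contrast(query: set, ref: set,
                     theta: float = 1.0, alpha: float = 0.0, beta: float = 1.0) -> float:
    """Tversky contrast model with f = set size.

    With alpha = 0, peaks present only in the query (e.g. from other mixture
    components) are not penalised; peaks present only in the reference are.
    """
    common = len(query & ref)
    query_only = len(query - ref)
    ref_only = len(ref - query)
    return theta * common - alpha * query_only - beta * ref_only


# Hypothetical peak positions (arbitrary units), not data from the paper.
mixture = {100, 250, 400, 620, 810}   # query: a mixture spectrum
contained = {250, 400}                # reference fully contained in the mixture
partial = {250, 400, 900, 950}        # reference only partly present

print(jaccard(mixture, contained), jaccard(contained, mixture))
print(tversky_contrast(mixture, contained), tversky_contrast(contained, mixture))
print(tversky_contrast(mixture, partial))
```

Swapping the arguments leaves the Jaccard index unchanged but changes the contrast-model score, which is what lets the measure answer "is this reference contained in the query?" rather than "are these two spectra alike?".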


Subject(s)
Data Mining/methods , Drug Discovery/methods , Spectrum Analysis , Halogenation
8.
IEEE J Biomed Health Inform ; 18(3): 855-62, 2014 May.
Article in English | MEDLINE | ID: mdl-24132025

ABSTRACT

Automated screening systems are commonly used to detect some agent in a sample and take a global decision about the subject (e.g., ill/healthy) based on these detections. We propose a Bayesian methodology for taking decisions in (sequential) screening systems that considers the false alarm rate of the detector. Our approach assesses the quality of its decisions and provides lower bounds on the achievable performance of the screening system from the training data. In addition, we develop a complete screening system for sputum smears in tuberculosis diagnosis, and show, using a real-world database, the advantages of the proposed framework when compared to the commonly used count detections and threshold approach.


Subject(s)
Automation, Laboratory/methods , Bacteriological Techniques/methods , Image Processing, Computer-Assisted/methods , Microscopy/methods , Tuberculosis/diagnosis , Tuberculosis/microbiology , Bayes Theorem , Humans