1.
Heliyon ; 10(16): e35865, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39220956

ABSTRACT

The digital era has expanded social exposure: easy internet access for mobile users allows for global communication. People can now learn what is going on around the globe with just a click; however, this has also given rise to the problem of fake news. Fake news is content that pretends to be true but is actually false and is disseminated to deceive. It poses a threat to social harmony, politics, the economy, and public opinion. As a result, fake news detection has become an emerging research domain that aims to identify a given piece of text as genuine or fraudulent. In this paper, a new framework called Generative Bidirectional Encoder Representations from Transformers (GBERT) is proposed that leverages a combination of the Generative Pre-trained Transformer (GPT) and Bidirectional Encoder Representations from Transformers (BERT) to address the fake news classification problem. The framework combines the best features of both cutting-edge techniques, BERT's deep contextual understanding and the generative capabilities of GPT, to create a comprehensive representation of a given text. Both GPT and BERT were fine-tuned on two real-world benchmark corpora and attained 95.30 % accuracy, 95.13 % precision, 97.35 % sensitivity, and a 96.23 % F1 score. The statistical test results indicate the effectiveness of the fine-tuned framework for fake news detection and suggest that it can be a promising approach for combating this global issue in the digital landscape.
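The fusion step described above can be illustrated with a minimal sketch: two encoder embeddings are concatenated and fed to a logistic classification head. The random toy vectors and tiny dimensions are placeholders, since the abstract does not specify the actual architecture or hidden sizes:

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse_and_classify(bert_emb, gpt_emb, weights, bias):
    """Concatenate two encoder embeddings and apply a logistic head."""
    fused = np.concatenate([bert_emb, gpt_emb])  # joint text representation
    logit = fused @ weights + bias               # linear classification head
    return 1.0 / (1.0 + np.exp(-logit))          # P(text is fake)

# Toy stand-ins; real BERT/GPT hidden states are 768+ dimensional.
bert_emb = rng.normal(size=16)
gpt_emb = rng.normal(size=16)
weights = rng.normal(size=32)
p_fake = fuse_and_classify(bert_emb, gpt_emb, weights, 0.0)
```

In practice the head would be trained jointly with the fine-tuned encoders rather than initialized randomly.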

2.
Digit Health ; 10: 20552076241277458, 2024.
Article in English | MEDLINE | ID: mdl-39221085

ABSTRACT

Background: Professional opinion polling has become a popular means of seeking advice for complex nephrology questions in the #AskRenal community on X. ChatGPT is a large language model with remarkable problem-solving capabilities, but its ability to provide solutions for real-world clinical scenarios remains unproven. This study seeks to evaluate how closely ChatGPT's responses align with current prevailing medical opinions in nephrology. Methods: Nephrology polls from X were submitted to ChatGPT-4, which generated answers without prior knowledge of the poll outcomes. Its responses were compared to the poll results (inter-rater) and a second set of responses given after a one-week interval (intra-rater) using Cohen's kappa statistic (κ). Subgroup analysis was performed based on question subject matter. Results: Our analysis comprised two rounds of testing ChatGPT on 271 nephrology-related questions. In the first round, ChatGPT's responses agreed with poll results for 163 of the 271 questions (60.2%; κ = 0.42, 95% CI: 0.38-0.46). In the second round, conducted to assess reproducibility, agreement improved slightly to 171 out of 271 questions (63.1%; κ = 0.46, 95% CI: 0.42-0.50). Comparison of ChatGPT's responses between the two rounds demonstrated high internal consistency, with agreement in 245 out of 271 responses (90.4%; κ = 0.86, 95% CI: 0.82-0.90). Subgroup analysis revealed stronger performance in the combined areas of homeostasis, nephrolithiasis, and pharmacology (κ = 0.53, 95% CI: 0.47-0.59 in both rounds), compared to other nephrology subfields. Conclusion: ChatGPT-4 demonstrates modest capability in replicating prevailing professional opinion in nephrology polls overall, with varying performance levels between question topics and excellent internal consistency. This study provides insights into the potential and limitations of using ChatGPT in medical decision making.
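Cohen's kappa (κ), the agreement statistic used above, corrects raw agreement for agreement expected by chance. A minimal pure-Python version, shown on toy labels rather than the study's poll data:

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters judging the same items."""
    n = len(rater_a)
    labels = sorted(set(rater_a) | set(rater_b))
    # Observed agreement: fraction of items rated identically.
    p_o = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label distribution.
    p_e = sum((rater_a.count(l) / n) * (rater_b.count(l) / n) for l in labels)
    return (p_o - p_e) / (1 - p_e)

kappa = cohens_kappa(["A", "A", "B", "B"], ["A", "A", "B", "A"])  # 0.5
```

Perfect agreement gives κ = 1, while agreement no better than chance gives κ = 0, which is why the study's intra-rater κ of 0.86 reads as high internal consistency.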

3.
Am J Hum Genet ; 2024 Sep 04.
Article in English | MEDLINE | ID: mdl-39255797

ABSTRACT

Phenotype-driven gene prioritization is fundamental to diagnosing rare genetic disorders. While traditional approaches rely on curated knowledge graphs with phenotype-gene relations, recent advancements in large language models (LLMs) promise a streamlined text-to-gene solution. In this study, we evaluated five LLMs, including two generative pre-trained transformers (GPT) series and three Llama2 series, assessing their performance across task completeness, gene prediction accuracy, and adherence to required output structures. We conducted experiments, exploring various combinations of models, prompts, phenotypic input types, and task difficulty levels. Our findings revealed that the best-performing LLM, GPT-4, achieved an average accuracy of 17.0% in identifying diagnosed genes within the top 50 predictions, which still falls behind traditional tools. However, accuracy increased with the model size. Consistent results were observed over time, as shown in the dataset curated after 2023. Advanced techniques such as retrieval-augmented generation (RAG) and few-shot learning did not improve the accuracy. Sophisticated prompts were more likely to enhance task completeness, especially in smaller models. Conversely, complicated prompts tended to decrease output structure compliance rate. LLMs also achieved better-than-random prediction accuracy with free-text input, though performance was slightly lower than with standardized concept input. Bias analysis showed that highly cited genes, such as BRCA1, TP53, and PTEN, are more likely to be predicted. Our study provides valuable insights into integrating LLMs with genomic analysis, contributing to the ongoing discussion on their utilization in clinical workflows.
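The top-50 accuracy reported above is a top-k hit rate over ranked gene lists; a minimal sketch with hypothetical gene symbols:

```python
def top_k_accuracy(ranked_lists, true_genes, k=50):
    """Fraction of cases whose diagnosed gene appears in the top-k predictions."""
    hits = sum(gene in ranked[:k] for ranked, gene in zip(ranked_lists, true_genes))
    return hits / len(true_genes)

acc = top_k_accuracy(
    [["BRCA1", "TP53", "PTEN"], ["TP53", "PTEN", "BRCA1"]],
    ["TP53", "BRCA1"],
    k=2,
)  # first case hits, second misses -> 0.5
```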

4.
J Comput Biol ; 2024 Sep 09.
Article in English | MEDLINE | ID: mdl-39246251

ABSTRACT

The identification of intrinsically disordered proteins and their functional roles is largely dependent on the performance of computational predictors, necessitating a high standard of accuracy in these tools. In this context, we introduce a novel series of computational predictors, termed PDFll (Predictors of Disorder and Function of proteins from the Language of Life), which are designed to offer precise predictions of protein disorder and associated functional roles based on protein sequences. PDFll is developed through a two-step process. Initially, it leverages large-scale protein language models (pLMs), trained on an extensive dataset comprising billions of protein sequences. Subsequently, the embeddings derived from pLMs are integrated into streamlined, yet sophisticated, deep-learning models to generate predictions. These predictions notably surpass the performance of existing state-of-the-art predictors, particularly those that forecast disorder and function without utilizing evolutionary information.

5.
Heliyon ; 10(16): e35941, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39253130

ABSTRACT

This paper presents a novel approach for a low-cost simulator-based driving assessment system incorporating a speech-based assistant, using pre-generated messages from Generative AI to achieve real-time interaction during the assessment. Simulator-based assessment is a crucial apparatus in the research toolkit for various fields. Traditional assessment approaches, like on-road evaluation, though reliable, can be risky, costly, and inaccessible. Simulator-based assessment using stationary driving simulators offers a safer evaluation and can be tailored to specific needs. However, these simulators are often only available to research-focused institutions due to their cost. To address this issue, our study proposes a system with the aforementioned properties aiming to enhance drivers' situational awareness, and foster positive emotional states, i.e., high valence and medium arousal, while assessing participants to prevent subpar performers from proceeding to the next stages of assessment and/or rehabilitation. In addition, this study introduces the speech-based assistant which provides timely guidance adaptable to the ever-changing context of the driving environment and vehicle state. The study's preliminary outcomes reveal encouraging progress, highlighting improved driving performance and positive emotional states when participants are engaged with the assistant during the assessment.

6.
JMIR Form Res ; 8: e56797, 2024 Sep 12.
Article in English | MEDLINE | ID: mdl-39265163

ABSTRACT

BACKGROUND: The public launch of OpenAI's ChatGPT platform generated immediate interest in the use of large language models (LLMs). Health care institutions are now grappling with establishing policies and guidelines for the use of these technologies, yet little is known about how health care providers view LLMs in medical settings. Moreover, there are no studies assessing how pediatric providers are adopting these readily accessible tools. OBJECTIVE: The aim of this study was to determine how pediatric providers are currently using LLMs in their work as well as their interest in using a Health Insurance Portability and Accountability Act (HIPAA)-compliant version of ChatGPT in the future. METHODS: A survey instrument consisting of structured and unstructured questions was iteratively developed by a team of informaticians from various pediatric specialties. The survey was sent via Research Electronic Data Capture (REDCap) to all Boston Children's Hospital pediatric providers. Participation was voluntary and uncompensated, and all survey responses were anonymous. RESULTS: Surveys were completed by 390 pediatric providers. Approximately 50% (197/390) of respondents had used an LLM; of these, almost 75% (142/197) were already using an LLM for nonclinical work and 27% (52/195) for clinical work. Providers detailed the various ways they are currently using an LLM in their clinical and nonclinical work. Only 29% (n=105) of 362 respondents indicated that ChatGPT should be used for patient care in its present state; however, 73.8% (273/368) reported they would use a HIPAA-compliant version of ChatGPT if one were available. Providers' proposed future uses of LLMs in health care are described. CONCLUSIONS: Despite significant concerns and barriers to LLM use in health care, pediatric providers are already using LLMs at work. This study will give policy makers needed information about how providers are using LLMs clinically.


Subject(s)
Health Personnel, Humans, Cross-Sectional Studies, Health Personnel/statistics & numerical data, Surveys and Questionnaires, Female, Male, Pediatrics, Boston, Adult, Health Insurance Portability and Accountability Act, United States
7.
Article in English | MEDLINE | ID: mdl-39268568

ABSTRACT

Artificially intelligent physical activity digital assistants that use the full spectrum of machine learning capabilities have not yet been developed and examined. This study aimed to explore potential users' perceptions and expectations of using such a digital assistant. Six 90-min online focus group meetings (n = 45 adults) were conducted. Meetings were recorded, transcribed and thematically analysed. Participants embraced the idea of a 'digital assistant' providing physical activity support. Participants indicated they would like to receive notifications from the digital assistant, but did not agree on the number, timing, tone and content of notifications. Likewise, they indicated that the digital assistant's personality and appearance should be customisable. Participants understood the need to provide information to the digital assistant to allow for personalisation, but varied greatly in the extent of information that they were willing to provide. Privacy issues aside, participants embraced the idea of using artificial intelligence or machine learning in return for a more functional and personal digital assistant. In sum, participants were ready for an artificially intelligent physical activity digital assistant but emphasised a need to personalise or customise nearly every feature of the application. This poses challenges in terms of cost and complexity of developing the application.

8.
J Biomed Inform ; 157: 104720, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39233209

ABSTRACT

BACKGROUND: In oncology, electronic health records contain textual key information for the diagnosis, staging, and treatment planning of patients with cancer. However, text data processing requires a lot of time and effort, which limits the utilization of these data. Recent advances in natural language processing (NLP) technology, including large language models, can be applied to cancer research. Particularly, extracting the information required for the pathological stage from surgical pathology reports can be utilized to update cancer staging according to the latest cancer staging guidelines. OBJECTIVES: This study has two main objectives. The first objective is to evaluate the performance of extracting information from text-based surgical pathology reports and determining pathological stages based on the extracted information using fine-tuned generative language models (GLMs) for patients with lung cancer. The second objective is to determine the feasibility of utilizing relatively small GLMs for information extraction in a resource-constrained computing environment. METHODS: Lung cancer surgical pathology reports were collected from the Common Data Model database of Seoul National University Bundang Hospital (SNUBH), a tertiary hospital in Korea. We selected 42 descriptors necessary for tumor-node (TN) classification based on these reports and created a gold standard with validation by two clinical experts. The pathology reports and gold standard were used to generate prompt-response pairs for training and evaluating GLMs, which then were used to extract information required for staging from pathology reports. RESULTS: We evaluated the information extraction performance of six trained models as well as their performance in TN classification using the extracted information. The Deductive Mistral-7B model, which was pre-trained with the deductive dataset, showed the best performance overall, with an exact match ratio of 92.24% in the information extraction problem and an accuracy of 0.9876 (predicting T and N classification concurrently) in classification. CONCLUSION: This study demonstrated that training GLMs with deductive datasets can improve information extraction performance, and GLMs with a relatively small number of parameters at approximately seven billion can achieve high performance in this problem. The proposed GLM-based information extraction method is expected to be useful in clinical decision-making support, lung cancer staging, and research.
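The exact match ratio used above counts a report as correct only when every extracted descriptor agrees with the gold standard; a minimal sketch with hypothetical descriptor dictionaries (the study uses 42 descriptors per report):

```python
def exact_match_ratio(predicted, gold):
    """Share of reports whose full descriptor set matches the gold standard exactly."""
    matches = sum(p == g for p, g in zip(predicted, gold))
    return matches / len(gold)

ratio = exact_match_ratio(
    [{"pT": "T2a", "pN": "N0"}, {"pT": "T1b", "pN": "N1"}],
    [{"pT": "T2a", "pN": "N0"}, {"pT": "T1b", "pN": "N2"}],
)  # 1 of 2 reports matches on every field -> 0.5
```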


Subject(s)
Lung Neoplasms, Natural Language Processing, Neoplasm Staging, Lung Neoplasms/pathology, Lung Neoplasms/diagnosis, Humans, Neoplasm Staging/methods, Electronic Health Records, Data Mining/methods, Algorithms, Databases, Factual
9.
Clin Imaging ; 114: 110271, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39236553

ABSTRACT

The advent of large language models (LLMs) marks a transformative leap in natural language processing, offering unprecedented potential in radiology, particularly in enhancing the accuracy and efficiency of coronary artery disease (CAD) diagnosis. While previous studies have explored the capabilities of specific LLMs like ChatGPT in cardiac imaging, a comprehensive evaluation comparing multiple LLMs in the context of CAD-RADS 2.0 has been lacking. This study addresses this gap by assessing the performance of various LLMs, including ChatGPT 4, ChatGPT 4o, Claude 3 Opus, Gemini 1.5 Pro, Mistral Large, Meta Llama 3 70B, and Perplexity Pro, in answering 30 multiple-choice questions derived from the CAD-RADS 2.0 guidelines. Our findings reveal that ChatGPT 4o achieved the highest accuracy at 100 %, with ChatGPT 4 and Claude 3 Opus closely following at 96.6 %. Other models, including Mistral Large, Perplexity Pro, Meta Llama 3 70B, and Gemini 1.5 Pro, also demonstrated commendable performance, though with slightly lower accuracy ranging from 90 % to 93.3 %. This study underscores the proficiency of current LLMs in understanding and applying CAD-RADS 2.0, suggesting their potential to significantly enhance radiological reporting and patient care in coronary artery disease. The variations in model performance highlight the need for further research, particularly in evaluating the visual diagnostic capabilities of LLMs, a critical component of radiology practice. This study provides a foundational comparison of LLMs in CAD-RADS 2.0 and sets the stage for future investigations into their broader applications in radiology, emphasizing the importance of integrating both text-based and visual knowledge for optimal clinical outcomes.


Subject(s)
Computed Tomography Angiography, Coronary Angiography, Coronary Artery Disease, Natural Language Processing, Humans, Computed Tomography Angiography/methods, Coronary Artery Disease/diagnostic imaging, Coronary Angiography/methods, Reproducibility of Results
10.
Sensors (Basel) ; 24(17)2024 Sep 06.
Article in English | MEDLINE | ID: mdl-39275711

ABSTRACT

As a fundamental element of the transportation system, traffic signs are widely used to guide traffic behaviors. In recent years, drones have emerged as an important tool for monitoring the conditions of traffic signs. However, existing image processing techniques are heavily reliant on image annotations, and it is time-consuming to build a high-quality dataset with diverse training images and human annotations. In this paper, we introduce the utilization of Vision-language Models (VLMs) in the traffic sign detection task. Without the need for discrete image labels, rapid deployment is enabled by multi-modal learning and large-scale pretrained networks. First, we compile a keyword dictionary to explain traffic signs. The Chinese national standard is used to suggest the shape and color information. Our program conducts Bootstrapping Language-image Pretraining v2 (BLIPv2) to translate representative images into text descriptions. Second, a Contrastive Language-image Pretraining (CLIP) framework is applied to characterize not only drone images but also text descriptions. Our method utilizes the pretrained encoder network to create visual features and word embeddings. Third, the category of each traffic sign is predicted according to the similarity between drone images and keywords. Cosine distance and a softmax function are used to calculate the class probability distribution. To evaluate the performance, we apply the proposed method in a practical application. The drone images captured from Guyuan, China, are employed to record the conditions of traffic signs. Further experiments include two widely used public datasets. The calculation results indicate that our vision-language model-based method has an acceptable prediction accuracy and low training cost.
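The third step above, cosine similarity between an image feature and each keyword embedding turned into class probabilities with softmax, can be sketched with toy vectors; in the actual method the features would come from the pretrained CLIP encoders:

```python
import numpy as np

def classify_by_similarity(image_feat, keyword_feats):
    """Softmax over cosine similarities between one image and each keyword."""
    img = image_feat / np.linalg.norm(image_feat)
    kws = keyword_feats / np.linalg.norm(keyword_feats, axis=1, keepdims=True)
    sims = kws @ img                    # cosine similarity per traffic-sign class
    exp = np.exp(sims - sims.max())     # numerically stable softmax
    return exp / exp.sum()

# Toy 2-D features for two candidate keyword classes.
probs = classify_by_similarity(np.array([1.0, 0.0]),
                               np.array([[1.0, 0.1], [0.0, 1.0]]))
```

The predicted category is simply the keyword class with the highest probability.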

11.
Epilepsy Res ; 207: 107451, 2024 Sep 10.
Article in English | MEDLINE | ID: mdl-39276641

ABSTRACT

OBJECTIVES: Monitoring seizure control metrics is key to clinical care of patients with epilepsy. Manually abstracting these metrics from unstructured text in electronic health records (EHR) is laborious. We aimed to abstract the date of last seizure and seizure frequency from clinical notes of patients with epilepsy using natural language processing (NLP). METHODS: We extracted seizure control metrics from notes of patients seen in epilepsy clinics from two hospitals in Boston. Extraction was performed with the pretrained model RoBERTa_for_seizureFrequency_QA, for both date of last seizure and seizure frequency, combined with regular expressions. We designed the algorithm to categorize the timing of last seizure ("today", "1-6 days ago", "1-4 weeks ago", "more than 1-3 months ago", "more than 3-6 months ago", "more than 6-12 months ago", "more than 1-2 years ago", "more than 2 years ago") and seizure frequency ("innumerable", "multiple", "daily", "weekly", "monthly", "once per year", "less than once per year"). Our ground truth consisted of structured questionnaires filled out by physicians. Model performance was measured using the areas under the receiver operating characteristic curve (AUROC) and the precision-recall curve (AUPRC) for categorical labels, and median absolute error (MAE) for ordinal labels, with 95 % confidence intervals (CI) estimated via bootstrapping. RESULTS: Our cohort included 1773 adult patients with a total of 5658 visits with reported seizure control metrics, seen in epilepsy clinics between December 2018 and May 2022. The cohort average age was 42 years old, the majority were female (57 %), White (81 %) and non-Hispanic (85 %). The models achieved an MAE (95 % CI) for date of last seizure of 4 (4.00-4.86) weeks, and for seizure frequency of 0.02 (0.02-0.02) seizures per day. CONCLUSIONS: Our NLP approach demonstrates that the extraction of seizure control metrics from EHR is feasible, allowing for large-scale EHR research.
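The bootstrapped confidence intervals above can be reproduced in outline: resample the absolute errors with replacement and take percentiles of the resampled medians (the study reports MAE as a median absolute error). A stdlib-only sketch on toy error values, not the study's data:

```python
import random
import statistics

def bootstrap_mae_ci(errors, n_boot=2000, alpha=0.05, seed=0):
    """Median absolute error with a bootstrap percentile confidence interval."""
    rng = random.Random(seed)
    abs_err = [abs(e) for e in errors]
    point = statistics.median(abs_err)
    # Resample with replacement and collect the median of each resample.
    boots = sorted(
        statistics.median(rng.choices(abs_err, k=len(abs_err)))
        for _ in range(n_boot)
    )
    lo = boots[int(alpha / 2 * n_boot)]
    hi = boots[int((1 - alpha / 2) * n_boot) - 1]
    return point, lo, hi

point, lo, hi = bootstrap_mae_ci([1, -2, 3, -4, 5])
```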

12.
Int J Biol Macromol ; : 135599, 2024 Sep 12.
Article in English | MEDLINE | ID: mdl-39276905

ABSTRACT

The computational identification of nucleic acid-binding proteins (NABPs) is of great significance for understanding the mechanisms of these biological activities and for drug discovery. Although a number of sequence-based methods have been proposed to predict NABPs and have achieved promising performance, structure information is often overlooked. On the other hand, the power of popular protein language models (pLMs) has seldom been harnessed for predicting NABPs. In this study, we propose a novel framework, GraphNABP, to predict NABPs by integrating sequence and predicted 3D structure information. Specifically, sequence embeddings and protein molecular graphs were first obtained from the ProtT5 protein language model and predicted 3D structures, respectively. Then, graph attention (GAT) and bidirectional long short-term memory (BiLSTM) neural networks were used to enhance feature representations. Finally, a fully connected layer is used to predict NABPs. To the best of our knowledge, this is the first work to integrate AlphaFold and protein language models for the prediction of NABPs. The performances on multiple independent test sets indicate that GraphNABP outperforms other state-of-the-art methods. Our results demonstrate the effectiveness of pLM embeddings and structural information for NABP prediction. The codes and data used in this study are available at https://github.com/lixiangli01/GraphNABP.

13.
Semin Vasc Surg ; 37(3): 314-320, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39277347

ABSTRACT

Natural language processing is a subfield of artificial intelligence that aims to analyze human oral or written language. The development of large language models has brought innovative perspectives in medicine, including the potential use of chatbots and virtual assistants. Nevertheless, the benefits and pitfalls of such technology need to be carefully evaluated before their use in health care. The aim of this narrative review was to provide an overview of potential applications of large language models and artificial intelligence chatbots in the field of vascular surgery, including clinical practice, research, and education. In light of the results, we discuss current limits and future directions.


Subject(s)
Artificial Intelligence, Natural Language Processing, Vascular Surgical Procedures, Humans
14.
Article in English | MEDLINE | ID: mdl-39278360

ABSTRACT

BACKGROUND: The rate of diagnosis of mast cell activation syndrome (MCAS) has increased since the disorder's original description as a mastocytosis-like phenotype. While a set of consortium MCAS criteria is well described and widely accepted, this increase occurs in the setting of a broader set of proposed alternative MCAS criteria. OBJECTIVE: Effective diagnostic criteria must minimize the range of unrelated diagnoses that can be erroneously classified as the condition of interest. We sought to determine if the symptoms associated with alternative MCAS criteria result in less concise or consistent diagnostic alternatives, reducing diagnostic specificity. METHODS: We used multiple large language models, including ChatGPT, Claude, and Gemini, to bootstrap the probabilities of diagnoses that are compatible with consortium or alternative MCAS criteria. We utilized diversity and network analysis to quantify diagnostic precision and specificity compared to control diagnostic criteria including systemic lupus erythematosus (SLE), Kawasaki disease, and migraines. RESULTS: Compared to consortium MCAS criteria, alternative MCAS criteria are associated with more variable (Shannon diversity 5.8 vs. 4.6, respectively; p-value=0.004) and less precise (mean Bray-Curtis similarity 0.07 vs. 0.19, respectively; p-value=0.004) diagnoses. The diagnosis networks derived from consortium and alternative MCAS criteria had lower between-network similarity compared to the similarity between diagnosis networks derived from two distinct SLE criteria (cosine similarity 0.55 vs. 0.86, respectively; p-value=0.0022). CONCLUSION: Alternative MCAS criteria are associated with a distinct set of diagnoses compared to consortium MCAS criteria and have lower diagnostic consistency. This lack of specificity is pronounced in relation to multiple control criteria, raising the concern that alternative criteria could disproportionately contribute to MCAS overdiagnosis, to the exclusion of more appropriate diagnoses.
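The two measures used above are standard diversity statistics; a minimal sketch over hypothetical diagnosis count vectors (one count per candidate diagnosis, not the study's data):

```python
import math

def shannon_diversity(counts):
    """Shannon diversity (natural log) of a diagnosis count vector."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis_similarity(x, y):
    """1 minus the Bray-Curtis dissimilarity between two count vectors."""
    shared = sum(min(a, b) for a, b in zip(x, y))
    return 2 * shared / (sum(x) + sum(y))

h = shannon_diversity([4, 4, 2])            # higher = more variable diagnoses
s = bray_curtis_similarity([3, 1], [1, 3])  # 1.0 = identical, 0.0 = disjoint
```

Higher Shannon diversity means the criteria map to a wider spread of diagnoses; lower Bray-Curtis similarity between bootstrap runs means less precise (less reproducible) diagnosis sets.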

15.
Article in English | MEDLINE | ID: mdl-39278616

ABSTRACT

OBJECTIVES: The task of writing structured content reviews and guidelines has grown larger and more complex. We propose to go beyond search tools, toward curation tools, by automating time-consuming and repetitive steps of extracting and organizing information. METHODS: SciScribe is built as an extension of IBM's Deep Search platform, which provides document processing and search capabilities. This platform was used to ingest and search full-content publications from PubMed Central (PMC) and official, structured records from the ClinicalTrials and OpenPayments databases. Author names and NCT numbers mentioned within the publications were used to link publications to these official records as context. Search strategies involve traditional keyword-based search as well as natural language question answering via large language models (LLMs). RESULTS: SciScribe is a web-based tool that helps accelerate literature reviews through key features: (1) accumulating a personal collection from publication sources, such as PMC or other sources; (2) incorporating contextual information from external databases into the presented papers, promoting a more informed assessment by readers; (3) semantic question answering over a document to quickly assess relevance and hierarchical organization; and (4) semantic question answering for each document within a collection, collated into tables. CONCLUSIONS: Emergent language processing techniques open new avenues to accelerate and enhance the literature review process, which we have demonstrated with a use case implementation within cardiac surgery. SciScribe automates and accelerates this process, mitigates errors associated with repetition and fatigue, and contextualizes results by linking relevant external data sources instantaneously.

16.
JMIR Med Inform ; 12: e58478, 2024 Sep 05.
Article in English | MEDLINE | ID: mdl-39235317

ABSTRACT

Unlabelled: With the popularization of large language models (LLMs), strategies for their effective and safe usage in health care and research have become increasingly pertinent. Despite the growing interest and eagerness among health care professionals and scientists to exploit the potential of LLMs, initial attempts may yield suboptimal results due to a lack of user experience, thus complicating the integration of artificial intelligence (AI) tools into workplace routine. Focusing on scientists and health care professionals with limited LLM experience, this viewpoint article highlights and discusses 6 easy-to-implement use cases of practical relevance. These encompass customizing translations, refining text and extracting information, generating comprehensive overviews and specialized insights, compiling ideas into cohesive narratives, crafting personalized educational materials, and facilitating intellectual sparring. Additionally, we discuss general prompting strategies and precautions for the implementation of AI tools in biomedicine. Despite various hurdles and challenges, the integration of LLMs into daily routines of physicians and researchers promises heightened workplace productivity and efficiency.

17.
Comput Struct Biotechnol J ; 23: 3254-3257, 2024 Dec.
Article in English | MEDLINE | ID: mdl-39286528

ABSTRACT

Introduction: OpenAI's ChatGPT, a Large Language Model (LLM), is a powerful tool across domains, designed for text and code generation, fostering collaboration, especially in public health. Investigating the role of this advanced LLM chatbot in assisting public health practitioners in shaping disease transmission models to inform infection control strategies marks a new era in infectious disease epidemiology research. This study used a case study to illustrate how ChatGPT collaborates with a public health practitioner in co-designing a mathematical transmission model. Methods: Using natural conversation, the practitioner initiated a dialogue involving an iterative process of code generation, refinement, and debugging with ChatGPT to develop a model to fit 10 days of prevalence data to estimate two key epidemiological parameters: i) the basic reproductive number (Ro) and ii) the final epidemic size. Verification and validation processes were conducted to ensure the accuracy and functionality of the final model. Results: ChatGPT developed a validated transmission model that replicated the epidemic curve and gave estimates of Ro of 4.19 (95 % CI: 4.13-4.26) and a final epidemic size of 98.3 % of the population within 60 days. It highlighted the advantages of using maximum likelihood estimation with a Poisson distribution over the least squares method. Conclusion: Integration of LLMs in medical research accelerates model development, reducing technical barriers for health practitioners, democratizing access to advanced modeling and potentially enhancing pandemic preparedness globally, particularly in resource-constrained populations.
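The abstract does not reproduce the generated model, but the core fitting idea can be sketched: simulate prevalence from a compartmental model (an SIR structure is assumed here, implied by Ro and final epidemic size, though the abstract does not name it) and score parameters with a Poisson log-likelihood rather than least squares. The parameter values below are illustrative, not the study's:

```python
import math

def sir_prevalence(beta, gamma, s0, i0, days):
    """Discrete-time SIR model; returns daily prevalence (the I compartment)."""
    s, i, r = s0, i0, 0.0
    out = []
    for _ in range(days):
        new_inf = beta * s * i / (s + i + r)  # new infections this day
        new_rec = gamma * i                   # new recoveries this day
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        out.append(i)
    return out

def poisson_nll(observed, expected):
    """Poisson negative log-likelihood (dropping the constant log k! term)."""
    return sum(mu - k * math.log(mu) for k, mu in zip(observed, expected))

# Illustrative run: beta/gamma gives Ro = 3.0 for this toy parameter set.
prev = sir_prevalence(beta=0.6, gamma=0.2, s0=990.0, i0=10.0, days=60)
```

Fitting would minimize `poisson_nll` over (beta, gamma) against the 10 days of observed prevalence, with Ro estimated as beta/gamma.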

19.
Article in English | MEDLINE | ID: mdl-39283994

ABSTRACT

Machine learning and data-driven methods have attracted a significant amount of attention for the acceleration of the design of molecules and materials. In this study, a material design protocol based on multimode modeling that combines literature modeling, numerical data collection, textual descriptor design, genetic modeling, experimental validation, first-principles calculation, and theoretical efficiency calculation is proposed, with a case study on designing compatible complex solvent molecules for a halide perovskite film, which is notorious for optoelectronic deactivation under hostile conditions, especially in water. In the multimode modeling design process, the textual descriptors play the central role and store rich literature scientific knowledge, which starts from the construction of a high-dimension literature model based on scientific articles and is realized by a genetic algorithm for materials predictions. The prediction is substantiated by follow-up experiments and first-principles calculations, leading to the successful identification of effective molecular combinations delivering an unprecedented large aqueous photocurrent (increasing by 3 orders of magnitude compared with that of CH3NH3PbI3) and remarkable aqueous stability (improving from 36% to 89% after immersion in water) under the hostile condition. This study provides a practical route via multimode modeling for accelerating the design of molecule-modified and solution-processed materials in a real scenario.
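The genetic-modeling step above can be illustrated with a toy genetic algorithm over binary descriptor vectors (tournament selection, one-point crossover, bit-flip mutation). The fitness function here simply counts active descriptors and stands in for the literature-model score, which is not specified in the abstract:

```python
import random

def genetic_search(fitness, n_bits=8, pop_size=20, generations=40, seed=1):
    """Toy genetic algorithm maximizing `fitness` over binary vectors."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        def pick():
            # Tournament selection of size 2.
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n_bits)
            child = p1[:cut] + p2[cut:]      # one-point crossover
            if rng.random() < 0.1:           # occasional bit-flip mutation
                j = rng.randrange(n_bits)
                child[j] ^= 1
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

best = genetic_search(sum)  # stand-in fitness: number of active descriptors
```

In the actual protocol, candidate descriptor combinations would be scored by the high-dimension literature model rather than a simple count.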

20.
Article in English | MEDLINE | ID: mdl-39271171

ABSTRACT

OBJECTIVES: The aim of this study was to investigate GPT-3.5 in generating and coding medical documents with International Classification of Diseases (ICD)-10 codes for data augmentation on low-resource labels. MATERIALS AND METHODS: Employing GPT-3.5, we generated and coded 9606 discharge summaries based on lists of ICD-10 code descriptions of patients with infrequent (or generation) codes within the MIMIC-IV dataset. Combined with the baseline training set, this formed an augmented training set. Neural coding models were trained on baseline and augmented data and evaluated on a MIMIC-IV test set. We report micro- and macro-F1 scores on the full codeset, generation codes, and their families. Weak Hierarchical Confusion Matrices determined within-family and outside-of-family coding errors in the latter codesets. The coding performance of GPT-3.5 was evaluated on prompt-guided self-generated data and real MIMIC-IV data. Clinicians evaluated the clinical acceptability of the generated documents. RESULTS: Data augmentation results in slightly lower overall model performance but improves performance for the generation candidate codes and their families, including one absent from the baseline training data. Augmented models display lower out-of-family error rates. GPT-3.5 identifies ICD-10 codes by their prompted descriptions but underperforms on real data. Evaluators highlighted the correctness of generated concepts but noted weaknesses in variety, supporting information, and narrative. DISCUSSION AND CONCLUSION: While GPT-3.5 alone, given our prompt setting, is unsuitable for ICD-10 coding, it supports data augmentation for training neural models. Augmentation positively affects generation code families but mainly benefits codes with existing examples. Augmentation reduces out-of-family errors. Documents generated by GPT-3.5 state prompted concepts correctly but lack variety and authenticity in their narratives.
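The micro- and macro-F1 scores reported above differ in how they pool counts: micro-F1 sums true/false positives across all labels (dominated by frequent codes), while macro-F1 averages per-label F1 (sensitive to rare codes, which is why it matters for low-resource labels). A minimal sketch over toy multi-label code sets, with hypothetical codes rather than MIMIC-IV data:

```python
def f1_scores(gold, pred, labels):
    """Micro- and macro-averaged F1 over multi-label code assignments."""
    tp = {l: 0 for l in labels}
    fp = {l: 0 for l in labels}
    fn = {l: 0 for l in labels}
    for g, p in zip(gold, pred):
        for l in labels:
            if l in p and l in g:
                tp[l] += 1          # correctly assigned code
            elif l in p:
                fp[l] += 1          # assigned but not in gold
            elif l in g:
                fn[l] += 1          # missed gold code

    def f1(t, f_p, f_n):
        denom = 2 * t + f_p + f_n
        return 2 * t / denom if denom else 0.0

    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    return micro, macro

micro, macro = f1_scores([{"A"}, {"A", "B"}], [{"A"}, {"A"}], ["A", "B"])
```

Here the rare code "B" is missed once, leaving micro-F1 at 0.8 but dragging macro-F1 down to 0.5.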
