Results 1 - 18 of 18
1.
Thromb Res ; 242: 109121, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39213896

ABSTRACT

In this review, we embark on a comprehensive exploration of venous thromboembolism (VTE) in the context of medical history and its current practice within medicine. We delve into the landscape of artificial intelligence (AI), exploring its present utility and envisioning its transformative roles within VTE management, from prevention to screening and beyond. Central to our discourse is a forward-looking perspective on the integration of AI within VTE in medicine, advocating for rigorous study design, robust validation processes, and meticulous statistical analysis to gauge the efficacy of AI applications. We further illuminate the potential of large language models and generative AI in revolutionizing VTE care, while acknowledging their inherent limitations and proposing innovative solutions to overcome challenges related to data availability and integrity, including the strategic use of synthetic data. The critical importance of navigating ethical, legal, and privacy concerns associated with AI is underscored, alongside the imperative for comprehensive governance and policy frameworks to regulate its deployment in VTE treatment. We conclude on a note of cautious optimism, where we highlight the significance of proactively addressing the myriad challenges that accompany the advent of AI in healthcare. Through diligent design, stringent validation, extensive education, and prudent regulation, we can harness AI's potential to significantly enhance our understanding and management of VTE. As we stand on the cusp of a new era, our commitment to these principles will be instrumental in ensuring that the promise of AI is fully realized within the realm of VTE care.


Subject(s)
Artificial Intelligence, Machine Learning, Venous Thromboembolism, Venous Thromboembolism/therapy, Venous Thromboembolism/diagnosis, Humans
2.
Front Comput Neurosci ; 18: 1388166, 2024.
Article in English | MEDLINE | ID: mdl-39114083

ABSTRACT

A good theory of mathematical beauty is more practical than any current observation, as new predictions about physical reality can be self-consistently verified. This belief applies to the current status of understanding deep neural networks, including large language models, and even biological intelligence. Toy models provide a metaphor of physical reality, allowing the reality to be formulated mathematically (i.e., the so-called theory), which can be updated as more conjectures are justified or refuted. One does not need to present all details in a model; rather, more abstract models are constructed, as complex systems such as brains or deep networks have many sloppy dimensions but far fewer stiff dimensions that strongly impact macroscopic observables. This type of bottom-up mechanistic modeling remains promising in the modern era of understanding natural or artificial intelligence. Here, we shed light on eight challenges in developing a theory of intelligence following this theoretical paradigm. These challenges are representation learning, generalization, adversarial robustness, continual learning, causal learning, the internal model of the brain, next-token prediction, and the mechanics of subjective experience.

3.
Front Med (Lausanne) ; 11: 1380148, 2024.
Article in English | MEDLINE | ID: mdl-38966538

ABSTRACT

Background: The use of large language models (LLM) has recently gained popularity in diverse areas, including answering questions posted by patients as well as medical professionals. Objective: To evaluate the performance and limitations of LLMs in providing the correct diagnosis for a complex clinical case. Design: Seventy-five consecutive clinical cases were selected from the Massachusetts General Hospital Case Records, and differential diagnoses were generated by OpenAI's GPT3.5 and GPT4 models. Results: The mean number of diagnoses provided was 16.77 by the Massachusetts General Hospital case discussants, 30 by GPT3.5, and 15.45 by GPT4 (p < 0.0001). GPT4 was more frequently able to list the correct diagnosis first (22% versus 20% with GPT3.5, p = 0.86) and to provide the correct diagnosis among the top three generated diagnoses (42% versus 24%, p = 0.075). When the diagnoses were classified into groups according to medical specialty, GPT4 was better at including the correct diagnosis at any point in the differential list (68% versus 48%, p = 0.0063). GPT4 provided a differential list that was more similar to the list provided by the case discussants than GPT3.5 (Jaccard Similarity Index 0.22 versus 0.12, p = 0.001). Inclusion of the correct diagnosis in the generated differential was correlated with the number of PubMed articles matching the diagnosis (OR 1.40, 95% CI 1.25-1.56 for GPT3.5; OR 1.25, 95% CI 1.13-1.40 for GPT4), but not with disease incidence. Conclusions and relevance: The GPT4 model was able to generate a differential diagnosis list with the correct diagnosis in approximately two thirds of cases, but the most likely diagnosis was often incorrect for both models. In its current state, this tool can at most be used as an aid to expand potential diagnostic considerations for a case, and future LLMs should be trained to account for the discrepancy between disease incidence and availability in the literature.
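The Jaccard Similarity Index used above to compare differential lists is straightforward to compute over two sets of diagnoses; a minimal sketch (the diagnosis strings below are illustrative, not taken from the study):

```python
def jaccard_index(list_a, list_b):
    """Jaccard Similarity Index: |A intersect B| / |A union B|."""
    a, b = set(list_a), set(list_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical differential lists for one case
discussants = ["sarcoidosis", "lymphoma", "tuberculosis"]
model = ["lymphoma", "sarcoidosis", "histoplasmosis", "silicosis"]
print(jaccard_index(discussants, model))  # 2 shared of 5 distinct -> 0.4
```

Values near 0.22 (GPT4) versus 0.12 (GPT3.5) thus indicate only modest overlap with the human discussants' lists in both cases.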

5.
World J Urol ; 42(1): 455, 2024 Jul 29.
Article in English | MEDLINE | ID: mdl-39073590

ABSTRACT

PURPOSE: Large language models (LLMs) are a form of artificial intelligence (AI) that uses deep learning techniques to understand, summarize and generate content. The potential benefits of LLMs in healthcare are predicted to be immense. The objective of this study was to examine the quality of patient information leaflets (PILs) produced by 3 LLMs on urological topics. METHODS: Prompts were created to generate PILs from 3 LLMs: ChatGPT-4, PaLM 2 (Google Bard) and Llama 2 (Meta) across four urology topics (circumcision, nephrectomy, overactive bladder syndrome, and transurethral resection of the prostate). PILs were evaluated using a quality assessment checklist. PIL readability was assessed by the Average Reading Level Consensus Calculator. RESULTS: PILs generated by PaLM 2 had the highest overall average quality score (3.58), followed by Llama 2 (3.34) and ChatGPT-4 (3.08). PaLM 2-generated PILs were of the highest quality in all topics except TURP, and PaLM 2 was the only LLM to include images. Medical inaccuracies were present in all generated content, including instances of significant error. Readability analysis identified PaLM 2-generated PILs as the simplest (age 14-15 average reading level); Llama 2 PILs were the most difficult (age 16-17 average). CONCLUSION: While LLMs can generate PILs that may help reduce healthcare professional workload, generated content requires clinician input for accuracy and for the inclusion of health literacy aids, such as images. LLM-generated PILs were above the average reading level for adults, necessitating improvement in LLM algorithms and/or prompt design. Patient satisfaction with LLM-generated PILs remains to be evaluated.


Subject(s)
Artificial Intelligence, Urology, Humans, Patient Education as Topic/methods, Language, Urologic Diseases/surgery
6.
Front Dement ; 3: 1385303, 2024.
Article in English | MEDLINE | ID: mdl-39081594

ABSTRACT

Introduction: Dementia is a progressive neurodegenerative disorder that affects cognitive abilities including memory, reasoning, and communication skills, leading to a gradual decline in daily activities and social engagement. In light of the recent advent of Large Language Models (LLMs) such as ChatGPT, this paper aims to thoroughly analyse their potential applications and usefulness in dementia care and research. Method: To this end, we offer an introduction to LLMs, outlining their key features, capabilities, limitations, potential risks, and practical considerations for deployment as easy-to-use software (e.g., smartphone apps). We then explore various domains related to dementia, identifying opportunities for LLMs to enhance understanding, diagnostics, and treatment, with a broader emphasis on improving patient care. For each domain, the specific contributions of LLMs are examined, such as their ability to engage users in meaningful conversations, deliver personalized support, and offer cognitive enrichment. Potential benefits encompass improved social interaction, enhanced cognitive functioning, increased emotional well-being, and reduced caregiver burden. The deployment of LLMs in caregiving frameworks also raises a number of concerns and considerations, including privacy and safety, the need for empirical validation, user-centered design, adaptation to the user's unique needs, and the integration of multimodal inputs to create more immersive and personalized experiences. Additionally, ethical guidelines and privacy protocols must be established to ensure responsible and ethical deployment of LLMs. Results: We report the results of a questionnaire completed by people with dementia (PwD) and their supporters, in which we surveyed the usefulness of different application scenarios of LLMs as well as the features that LLM-powered apps should have. Both PwD and supporters were largely positive regarding the prospect of LLMs in care, although concerns were raised regarding bias, data privacy and transparency. Discussion: Overall, this review corroborates the promising utilization of LLMs to positively impact dementia care by boosting cognitive abilities, enriching social interaction, and supporting caregivers. The findings underscore the importance of further research and development in this field to fully harness the benefits of LLMs and maximize their potential for improving the lives of individuals living with dementia.

7.
Diagnostics (Basel) ; 14(14)2024 Jul 17.
Article in English | MEDLINE | ID: mdl-39061677

ABSTRACT

BACKGROUND AND OBJECTIVES: Integrating large language models (LLMs) such as GPT-4 Turbo into diagnostic imaging faces a significant challenge, with current misdiagnosis rates ranging from 30% to 50%. This study evaluates how prompt engineering and confidence thresholds can improve diagnostic accuracy in neuroradiology. METHODS: We analyze 751 neuroradiology cases from the American Journal of Neuroradiology using GPT-4 Turbo with customized prompts to improve diagnostic precision. RESULTS: Initially, GPT-4 Turbo achieved a baseline diagnostic accuracy of 55.1%. By reformatting responses to list five diagnostic candidates and applying a 90% confidence threshold, the precision of the leading diagnosis increased to 72.9%, and the candidate list contained the correct diagnosis in 85.9% of cases, reducing the misdiagnosis rate to 14.1%. However, this threshold reduced the number of cases for which the model produced a response. CONCLUSIONS: Strategic prompt engineering and high confidence thresholds significantly reduce misdiagnoses and improve the precision of LLM diagnostics in neuroradiology. More research is needed to optimize these approaches for broader clinical implementation, balancing accuracy and utility.
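The confidence-threshold strategy described above can be sketched as a post-processing step: ask the model for five candidates with self-reported confidences, then accept a case only when the top candidate clears the threshold. The data structure, diagnoses, and confidences below are illustrative assumptions, not the study's actual prompts or outputs:

```python
def accept_diagnosis(candidates, threshold=0.90):
    """Return the top diagnosis if its self-reported confidence clears the
    threshold; otherwise return None and leave the case unanswered."""
    if not candidates:
        return None
    top = max(candidates, key=lambda c: c["confidence"])
    return top["diagnosis"] if top["confidence"] >= threshold else None

# Hypothetical five-candidate output for one case
candidates = [
    {"diagnosis": "glioblastoma", "confidence": 0.93},
    {"diagnosis": "metastasis", "confidence": 0.04},
    {"diagnosis": "abscess", "confidence": 0.02},
    {"diagnosis": "lymphoma", "confidence": 0.005},
    {"diagnosis": "tumefactive MS", "confidence": 0.005},
]
print(accept_diagnosis(candidates))  # "glioblastoma" clears the 90% threshold
```

The trade-off the abstract reports falls out of this design: raising the threshold improves precision on answered cases while shrinking the set of cases that get any answer at all.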

8.
Jpn J Radiol ; 42(8): 918-926, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38733472

ABSTRACT

PURPOSE: To assess the performance of GPT-4 Turbo with Vision (GPT-4TV), OpenAI's latest multimodal large language model, by comparing its ability to process both text and image inputs with that of the text-only GPT-4 Turbo (GPT-4T) in the context of the Japan Diagnostic Radiology Board Examination (JDRBE). MATERIALS AND METHODS: The dataset comprised questions from JDRBE 2021 and 2023. A total of six board-certified diagnostic radiologists discussed the questions and provided ground-truth answers by consulting relevant literature as necessary. The following questions were excluded: those lacking associated images, those with no unanimous agreement on answers, and those including images rejected by the OpenAI application programming interface. The inputs for GPT-4TV included both text and images, whereas those for GPT-4T were text only. Both models were deployed on the dataset, and their performance was compared using McNemar's exact test. The radiological credibility of the responses was assessed by two diagnostic radiologists through the assignment of legitimacy scores on a five-point Likert scale. These scores were subsequently used to compare model performance using Wilcoxon's signed-rank test. RESULTS: The dataset comprised 139 questions. GPT-4TV correctly answered 62 questions (45%), whereas GPT-4T correctly answered 57 questions (41%). A statistical analysis found no significant performance difference between the two models (P = 0.44). The GPT-4TV responses received significantly lower legitimacy scores from both radiologists than the GPT-4T responses. CONCLUSION: No significant enhancement in accuracy was observed when using GPT-4TV with image input compared with using the text-only GPT-4T on JDRBE questions.
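McNemar's exact test used above depends only on the discordant pairs, i.e., questions one model answered correctly and the other did not; a self-contained sketch with illustrative counts (the study's per-question results are not reproduced here):

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar test on discordant-pair counts b and c.
    Under H0, each discordant pair favors either model with probability 1/2,
    so the test is a two-sided binomial test on min(b, c) out of b + c."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Illustrative counts: 12 questions only model A solved, 7 only model B solved
print(round(mcnemar_exact(12, 7), 3))
```

Note that concordant pairs (both right or both wrong) drop out entirely, which is why a 62-versus-57 raw score difference can still yield a non-significant P value.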


Subject(s)
Radiology, Humans, Japan, Radiology/education, Specialty Boards, Clinical Competence, Educational Measurement/methods
9.
Radiol Artif Intell ; 6(4): e230364, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38717292

ABSTRACT

Purpose To assess the performance of a local open-source large language model (LLM) in various information extraction tasks from real-life emergency brain MRI reports. Materials and Methods All consecutive emergency brain MRI reports written in 2022 from a French quaternary center were retrospectively reviewed. Two radiologists identified MRI scans that were performed in the emergency department for headaches. Four radiologists scored the reports' conclusions as either normal or abnormal. Abnormalities were labeled as either headache-causing or incidental. Vicuna (LMSYS Org), an open-source LLM, performed the same tasks. Vicuna's performance metrics were evaluated using the radiologists' consensus as the reference standard. Results Among the 2398 reports during the study period, radiologists identified 595 that included headaches in the indication (median age of patients, 35 years [IQR, 26-51 years]; 68% [403 of 595] women). A positive finding was reported in 227 of 595 (38%) cases, 136 of which could explain the headache. The LLM had a sensitivity of 98.0% (95% CI: 96.5, 99.0) and specificity of 99.3% (95% CI: 98.8, 99.7) for detecting the presence of headache in the clinical context, a sensitivity of 99.4% (95% CI: 98.3, 99.9) and specificity of 98.6% (95% CI: 92.2, 100.0) for the use of contrast medium injection, a sensitivity of 96.0% (95% CI: 92.5, 98.2) and specificity of 98.9% (95% CI: 97.2, 99.7) for study categorization as either normal or abnormal, and a sensitivity of 88.2% (95% CI: 81.6, 93.1) and specificity of 73% (95% CI: 62, 81) for causal inference between MRI findings and headache. Conclusion An open-source LLM was able to extract information from free-text radiology reports with excellent accuracy without requiring further training. Keywords: Large Language Model (LLM), Generative Pretrained Transformers (GPT), Open Source, Information Extraction, Report, Brain, MRI Supplemental material is available for this article. 
Published under a CC BY 4.0 license. See also the commentary by Akinci D'Antonoli and Bluethgen in this issue.
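The sensitivity and specificity figures above follow from a standard confusion-matrix computation against the radiologists' consensus; a minimal sketch (the counts below are made up, not the study's):

```python
def sensitivity_specificity(tp, fp, tn, fn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP),
    computed against a reference standard (here, radiologist consensus)."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical confusion-matrix counts for one extraction task
sens, spec = sensitivity_specificity(tp=98, fp=2, tn=290, fn=2)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

The causal-inference task (linking MRI findings to the headache) shows visibly wider confidence intervals than the simpler extraction tasks, consistent with both its difficulty and its smaller denominator.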


Subject(s)
Headache, Magnetic Resonance Imaging, Natural Language Processing, Humans, Female, Adult, Male, Middle Aged, Magnetic Resonance Imaging/methods, Retrospective Studies, Headache/diagnostic imaging, Headache/diagnosis, Sensitivity and Specificity, Emergency Service, Hospital, Brain/diagnostic imaging, Brain/pathology
10.
Sci Rep ; 14(1): 5224, 2024 03 04.
Article in English | MEDLINE | ID: mdl-38433238

ABSTRACT

Large language models (LLMs) have the potential to transform our lives and work through the content they generate, known as AI-Generated Content (AIGC). To harness this transformation, we need to understand the limitations of LLMs. Here, we investigate the bias of AIGC produced by seven representative LLMs, including ChatGPT and LLaMA. We collect news articles from The New York Times and Reuters, both known for their dedication to providing unbiased news. We then apply each examined LLM to generate news content with the headlines of these news articles as prompts, and evaluate the gender and racial biases of the AIGC produced by the LLM by comparing the AIGC and the original news articles. We further analyze the gender bias of each LLM under biased prompts by adding gender-biased messages to prompts constructed from these news headlines. Our study reveals that the AIGC produced by each examined LLM demonstrates substantial gender and racial biases. Moreover, the AIGC generated by each LLM exhibits notable discrimination against females and individuals of the Black race. Among the LLMs, the AIGC generated by ChatGPT demonstrates the lowest level of bias, and ChatGPT is the sole model capable of declining content generation when provided with biased prompts.


Subject(s)
Aortic Valve Insufficiency, Camelids, New World, Humans, Female, Male, Animals, Sexism, Bias, Language
11.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38314912

ABSTRACT

Increasing volumes of biomedical data are amassing in databases. Large-scale analyses of these data have wide-ranging applications in biology and medicine. Such analyses require tools to characterize and process entries at scale. However, existing tools, mainly centered on extracting predefined fields, often fail to comprehensively process database entries or correct evident errors-a task humans can easily perform. These tools also lack the ability to reason like domain experts, hindering their robustness and analytical depth. Recent advances with large language models (LLMs) provide a fundamentally new way to query databases. But while a tool such as ChatGPT is adept at answering questions about manually input records, challenges arise when scaling up this process. First, interactions with the LLM need to be automated. Second, limitations on input length may require a record pruning or summarization pre-processing step. Third, to behave reliably as desired, the LLM needs either well-designed, short, 'few-shot' examples, or fine-tuning based on a larger set of well-curated examples. Here, we report ChIP-GPT, based on fine-tuning of the generative pre-trained transformer (GPT) model Llama and on a program prompting the model iteratively and handling its generation of answer text. This model is designed to extract metadata from the Sequence Read Archive, emphasizing the identification of chromatin immunoprecipitation (ChIP) targets and cell lines. When trained with 100 examples, ChIP-GPT demonstrates 90-94% accuracy. Notably, it can seamlessly extract data from records with typos or absent field labels. Our proposed method is easily adaptable to customized questions and different databases.
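The "few-shot" prompting step the authors describe amounts to assembling worked examples ahead of the new record so the model answers in kind. The records, field names, and prompt layout below are hypothetical illustrations, not the actual ChIP-GPT format; the typo in the first example is deliberate, mirroring the paper's point that extraction works on records with typos:

```python
FEW_SHOT_EXAMPLES = [
    # (record text, extracted answer) -- hypothetical, not from the SRA
    ("ChIP-seq of CTCF in HeLa celss, rep 1", "target: CTCF; cell line: HeLa"),
    ("Input DNA, K562, no antibody", "target: input; cell line: K562"),
]

def build_prompt(record):
    """Prepend worked examples so the model answers the new record in kind."""
    parts = []
    for text, answer in FEW_SHOT_EXAMPLES:
        parts.append(f"Record: {text}\nAnswer: {answer}")
    parts.append(f"Record: {record}\nAnswer:")
    return "\n\n".join(parts)

print(build_prompt("ChIP-seq targeting H3K27ac in GM12878 cells"))
```

As the abstract notes, the alternative to carefully designed few-shot examples is fine-tuning on a larger curated set, which is the route ChIP-GPT itself takes with its 100 training examples.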


Subject(s)
Medicine, Humans, Cell Line, Chromatin Immunoprecipitation, Databases, Factual, Language
12.
Cell Rep Methods ; 4(2): 100707, 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38325383

ABSTRACT

Alternative polyadenylation (APA) is a key post-transcriptional regulatory mechanism; yet, its regulation and impact on human diseases remain understudied. Existing bulk RNA sequencing (RNA-seq)-based APA methods predominantly rely on predefined annotations, severely impacting their ability to decode novel tissue- and disease-specific APA changes. Furthermore, they only account for the most proximal and distal cleavage and polyadenylation sites (C/PASs). Deconvoluting overlapping C/PASs and the inherent noisy 3' UTR coverage in bulk RNA-seq data pose additional challenges. To overcome these limitations, we introduce PolyAMiner-Bulk, an attention-based deep learning algorithm that accurately recapitulates C/PAS sequence grammar, resolves overlapping C/PASs, captures non-proximal-to-distal APA changes, and generates visualizations to illustrate APA dynamics. Evaluation on multiple datasets strongly evinces the performance merit of PolyAMiner-Bulk, accurately identifying more APA changes compared with other methods. With the growing importance of APA and the abundance of bulk RNA-seq data, PolyAMiner-Bulk establishes a robust paradigm of APA analysis.


Subject(s)
Deep Learning, Polyadenylation, Humans, Polyadenylation/genetics, RNA-Seq, RNA, Sequence Analysis, RNA/methods, Algorithms
13.
J Neurol Sci ; 453: 120803, 2023 10 15.
Article in English | MEDLINE | ID: mdl-37742349
14.
Front Big Data ; 6: 1224976, 2023.
Article in English | MEDLINE | ID: mdl-37680954

ABSTRACT

ChatGPT, a new language model developed by OpenAI, has garnered significant attention in various fields since its release. This literature review provides an overview of early ChatGPT literature across multiple disciplines, exploring its applications, limitations, and ethical considerations. The review encompasses Scopus-indexed publications from November 2022 to April 2023 and includes 156 articles related to ChatGPT. The findings reveal a predominance of negative sentiment across disciplines, though subject-specific attitudes must be considered. The review highlights the implications of ChatGPT in many fields including healthcare, raising concerns about employment opportunities and ethical considerations. While ChatGPT holds promise for improved communication, further research is needed to address its capabilities and limitations. This literature review provides insights into early research on ChatGPT, informing future investigations and practical applications of chatbot technology, as well as development and usage of generative AI.

15.
J Hand Surg Eur Vol ; 48(8): 819-822, 2023 09.
Article in English | MEDLINE | ID: mdl-37417005

ABSTRACT

Much has been written about the concerns surrounding artificial intelligence (AI). This article looks positively at how AI can enhance communication and academic skills, including teaching and research. The article explains what AI, the Generative Pre-trained Transformer (GPT), and ChatGPT are, and highlights a few AI-based tools currently in use to improve communication and academic skills. It also mentions potential problems with AI, such as a lack of personalization, societal biases, and privacy concerns. The future lies in training hand surgeons to master the skill of precise communication and academic work using AI tools.


Subject(s)
Artificial Intelligence, Surgeons, Humans, Educational Status, Communication, Writing
16.
Am J Bioeth ; 23(10): 8-16, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37440696

ABSTRACT

In the last several months, several major disciplines have started their initial reckoning with what ChatGPT and other Large Language Models (LLMs) mean for them: law, medicine, and business, among other professions. With a heavy dose of humility, given how fast the technology is moving and how uncertain its social implications are, this article attempts to give some early tentative thoughts on what ChatGPT might mean for bioethics. I will first argue that many bioethics issues raised by ChatGPT are similar to those raised by current medical AI built into devices, decision support tools, data analytics, etc. These include issues of data ownership, consent for data use, data representativeness and bias, and privacy. I describe how these familiar issues appear somewhat differently in the ChatGPT context, but much of the existing bioethical thinking on these issues provides a strong starting point. There are, however, a few "new-ish" issues I highlight; by new-ish I mean issues that, while perhaps not truly new, seem much more important for ChatGPT than for other forms of medical AI. These include informed consent and the right to know we are dealing with an AI, the problem of medical deepfakes, the risk of oligopoly and inequitable access related to foundation models, environmental effects, and, on the positive side, opportunities for the democratization of knowledge and for empowering patients. I also discuss how races towards dominance (between large companies, and between the U.S. and geopolitical rivals like China) risk sidelining ethics.

17.
Int Forum Allergy Rhinol ; 13(12): 2231-2234, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37260081

ABSTRACT

KEY POINTS: GPT-4 is an AI language model that can answer basic questions about rhinologic disease. Vetting is needed before AI models can be safely integrated into otolaryngologic patient care.


Subject(s)
Hypersensitivity, Rhinitis, Sinusitis, Humans, Rhinitis/diagnosis, Rhinitis/therapy, Sinusitis/diagnosis, Sinusitis/therapy, Consensus, Chronic Disease