Results 1 - 2 of 2
1.
J Am Coll Radiol; 21(7): 1072-1078, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38224925

ABSTRACT

BACKGROUND AND PURPOSE: Large language models (LLMs) have seen explosive growth, but their potential role in medical applications remains underexplored. Our study investigates the capability of LLMs to predict the most appropriate imaging study for specific clinical presentations across subspecialty areas in radiology. METHODS AND MATERIALS: Chat Generative Pretrained Transformer (ChatGPT) by OpenAI and Glass AI by Glass Health were tested on 1,075 clinical scenarios from 11 ACR expert panels to determine the most appropriate imaging study, benchmarked against the ACR Appropriateness Criteria. Two responses per clinical presentation were generated and averaged to give the final clinical presentation score; presentation scores within each topic area were averaged to give the topic score, and topic scores within a panel were averaged to give that panel's final score. LLM responses were scored on a scale of 0 to 3, with partial credit for nonspecific answers. The Pearson correlation coefficient (R value) was calculated for each panel to assess context-specific performance. RESULTS: Glass AI scored significantly higher than ChatGPT (2.32 ± 0.67 versus 2.08 ± 0.74, P = .002). Both LLMs performed best in the Polytrauma, Breast, and Vascular panels and worst in the Neurologic, Musculoskeletal, and Cardiac panels. Glass AI outperformed ChatGPT in 10 of 11 panels, the exception being Obstetrics and Gynecology. Agreement between the two models was highest in the Pediatrics, Neurologic, and Thoracic panels and lowest in the Vascular, Breast, and Urologic panels. CONCLUSION: LLMs can be used to predict appropriate imaging studies, and Glass AI's superior performance points to the benefit of additional medical-text training. This supports the potential of LLMs in radiologic decision making.
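
The hierarchical scoring described above (response, presentation, topic, panel) is easy to misread, so here is a minimal Python sketch of it, together with a per-panel Pearson correlation; scipy is assumed available, and every function name, score, and number below is invented for illustration rather than taken from the study.

    from statistics import mean
    from scipy.stats import pearsonr

    def presentation_score(response_scores):
        # Average the two graded responses (0-3 scale) for one presentation.
        return mean(response_scores)

    def panel_score(topics):
        # topics maps topic name -> list of presentation scores in that topic;
        # each topic score is the mean, and the panel score is the mean of those.
        return mean(mean(scores) for scores in topics.values())

    # Toy panel with two topic areas (hypothetical scores).
    toy_panel = {
        "topic_a": [presentation_score([3, 3]), presentation_score([2, 3])],
        "topic_b": [presentation_score([3, 2])],
    }
    print(panel_score(toy_panel))  # -> 2.625

    # Hypothetical per-panel correlation between two models' presentation scores.
    chatgpt = [2.0, 2.5, 1.5, 3.0, 2.0]
    glass_ai = [2.5, 3.0, 1.0, 3.0, 2.5]
    r, p = pearsonr(chatgpt, glass_ai)
    print(f"R = {r:.2f}, p = {p:.3f}")

Note that averaging at each level before moving up weights every topic equally within a panel, regardless of how many presentations it contains; that is what the abstract's method implies.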


Subject(s)
Radiology, Humans, Clinical Decision-Making
2.
J Am Coll Radiol; 20(10): 1004-1009, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37423349

ABSTRACT

PURPOSE: Large language models (LLMs) have demonstrated a level of competency within the medical field. The aim of this study was to explore the ability of LLMs to predict the best neuroradiologic imaging modality given specific clinical presentations, and to determine whether LLMs can outperform an experienced neuroradiologist in this regard. METHODS: ChatGPT and Glass AI, a health care-based LLM by Glass Health, were used. ChatGPT was prompted to rank its three best neuroimaging modalities, while only the best responses from Glass AI and the neuroradiologist were taken. The responses were compared with the ACR Appropriateness Criteria for 147 conditions. Clinical scenarios were passed to each LLM twice to account for stochasticity. Each output was scored out of 3 on the basis of the criteria, with partial credit for nonspecific answers. RESULTS: ChatGPT and Glass AI scored 1.75 and 1.83, respectively, with no statistically significant difference between them. The neuroradiologist scored 2.20, significantly outperforming both LLMs. ChatGPT was also the more inconsistent of the two LLMs: the score difference between its two outputs was statistically significant, as were the score differences across its ranked responses. CONCLUSIONS: LLMs perform well in selecting appropriate neuroradiologic imaging procedures when prompted with specific clinical scenarios. ChatGPT performed on par with Glass AI, suggesting that medical-text training could significantly improve its performance in this application. Neither LLM outperformed an experienced neuroradiologist, indicating the need for continued improvement in the medical context.
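
For a concrete picture of the grading scheme, the sketch below (again Python with scipy, on hypothetical data) implements a 0-to-3 score with partial credit for nonspecific answers and a paired test over the two runs per scenario; the substring rule used for "nonspecific" and all values shown are assumptions, not the authors' published rubric.

    from scipy.stats import ttest_rel

    def score_response(response, appropriate):
        # Grade one answer against the set of ACR-appropriate modalities.
        if response in appropriate:
            return 3.0  # names an appropriate modality exactly
        if any(response in m for m in appropriate):
            return 1.5  # nonspecific, e.g. "MRI" vs "MRI head without contrast"
        return 0.0      # inappropriate or unrelated answer

    appropriate = {"MRI head without contrast", "CT head without contrast"}
    print(score_response("MRI head without contrast", appropriate))  # 3.0
    print(score_response("MRI", appropriate))                        # 1.5 (partial)
    print(score_response("Ultrasound abdomen", appropriate))         # 0.0

    # Each scenario is passed to the model twice; a paired test on the two
    # runs of hypothetical scores probes the inconsistency the study reports.
    run1 = [3.0, 1.5, 0.0, 3.0, 1.5, 3.0]
    run2 = [1.5, 1.5, 0.0, 3.0, 3.0, 1.5]
    t, p = ttest_rel(run1, run2)
    print(f"t = {t:.2f}, p = {p:.3f}")

A significant paired-test result on the duplicate runs, as the study reports for ChatGPT, indicates that the model's scores shift systematically between otherwise identical prompts.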


Subject(s)
Language, Neuroimaging, Humans, Radiologists