A study of generative large language model for medical research and healthcare.

Peng, Cheng; Yang, Xi; Chen, Aokun; Smith, Kaleb E; PourNejatian, Nima; Costa, Anthony B; Martin, Cheryl; Flores, Mona G; Zhang, Ying; Magoc, Tanja; Lipori, Gloria; Mitchell, Duane A; Ospina, Naykky S; Ahmed, Mustafa M; Hogan, William R; Shenkman, Elizabeth A; Guo, Yi; Bian, Jiang; Wu, Yonghui

Affiliation

Peng C; Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA.
Yang X; Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA.
Chen A; Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, FL, USA.
Smith KE; Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA.
PourNejatian N; Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, FL, USA.
Costa AB; NVIDIA, Santa Clara, CA, USA.
Martin C; NVIDIA, Santa Clara, CA, USA.
Flores MG; NVIDIA, Santa Clara, CA, USA.
Zhang Y; NVIDIA, Santa Clara, CA, USA.
Magoc T; NVIDIA, Santa Clara, CA, USA.
Lipori G; Research Computing, University of Florida, Gainesville, FL, USA.
Mitchell DA; Integrated Data Repository Research Services, University of Florida, Gainesville, FL, USA.
Ospina NS; Integrated Data Repository Research Services, University of Florida, Gainesville, FL, USA.
Ahmed MM; Lillian S. Wells Department of Neurosurgery, Clinical and Translational Science Institute, University of Florida, Gainesville, FL, USA.
Hogan WR; Lillian S. Wells Department of Neurosurgery, Clinical and Translational Science Institute, University of Florida, Gainesville, FL, USA.
Shenkman EA; Division of Endocrinology, Department of Medicine, College of Medicine, University of Florida, Gainesville, FL, USA.
Guo Y; Division of Cardiovascular Medicine, Department of Medicine, College of Medicine, University of Florida, Gainesville, FL, USA.
Bian J; Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA.
Wu Y; Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA.

NPJ Digit Med ; 6(1): 210, 2023 Nov 16.

Article in English | MEDLINE | ID: mdl-37973919

ABSTRACT

There is enormous enthusiasm, as well as concern, about applying large language models (LLMs) to healthcare. Yet current assumptions are based on general-purpose LLMs such as ChatGPT, which are not developed for medical use. This study develops a generative clinical LLM, GatorTronGPT, using 277 billion words of text, including (1) 82 billion words of clinical text from 126 clinical departments and approximately 2 million patients at the University of Florida Health and (2) 195 billion words of diverse general English text. We train GatorTronGPT using a GPT-3 architecture with up to 20 billion parameters and evaluate its utility for biomedical natural language processing (NLP) and healthcare text generation. GatorTronGPT improves biomedical NLP. We apply GatorTronGPT to generate 20 billion words of synthetic text. NLP models trained on synthetic text generated by GatorTronGPT outperform models trained on real-world clinical text. A physicians' Turing test using a 1 (worst) to 9 (best) scale shows no significant differences in linguistic readability (p = 0.22; 6.57 for GatorTronGPT compared with 6.93 for human text) or clinical relevance (p = 0.91; 7.0 for GatorTronGPT compared with 6.97 for human text), and that physicians cannot differentiate them (p < 0.001). This study provides insights into the opportunities and challenges of LLMs for medical research and healthcare.

Full text: 1 Collection: 01-internacional Database: MEDLINE Language: En Journal: NPJ Digit Med Year: 2023 Document type: Article Affiliation country: United States Country of publication: United Kingdom
