Benchmarking Large Language Models for Cervical Spondylosis.

Zhang, Boyan; Du, Yueqi; Duan, Wanru; Chen, Zan

Affiliation

Zhang B; Xuanwu Hospital, Capital Medical University, Beijing, China.
Du Y; Lab of Spinal Cord Injury and Functional Reconstruction, China International Neuroscience Institute, Beijing, China.
Duan W; Xuanwu Hospital, Capital Medical University, Beijing, China.
Chen Z; Lab of Spinal Cord Injury and Functional Reconstruction, China International Neuroscience Institute, Beijing, China.

JMIR Form Res ; 8: e55577, 2024 Aug 05.

Article in English | MEDLINE | ID: mdl-39102674

ABSTRACT

Cervical spondylosis is the most common degenerative spinal disorder in modern societies. Patients require a great deal of medical knowledge, and large language models (LLMs) offer patients a novel and convenient tool for accessing medical advice. In this study, we collected the questions most frequently asked by patients with cervical spondylosis in clinical work and internet consultations. The accuracy of the answers provided by LLMs was evaluated and graded by 3 experienced spinal surgeons. Comparative analysis of the responses showed that all LLMs could provide satisfactory results and that, among them, GPT-4 had the highest accuracy rate. Variation across sections in all LLMs revealed the boundaries of their abilities and the direction of artificial intelligence development.

Keywords

ChatGPT; LLM; cervical spondylosis; large language model; patient

Full text: 1 | Collection: 01-international | Database: MEDLINE | Language: English | Journal: JMIR Form Res | Year: 2024 | Document type: Article | Affiliation country: China | Country of publication: Canada
