The Role of Artificial Intelligence in Endocrine Management: Assessing ChatGPT's Responses to Prolactinoma Queries.

Senoymak, Mustafa Can; Erbatur, Nuriye Hale; Senoymak, Irem; Firat, Sevde Nur

Affiliation

Senoymak MC; Department of Endocrinology and Metabolism, University of Health Sciences, Sultan Abdulhamid Han Training and Research Hospital, Istanbul 34668, Turkey.
Erbatur NH; Department of Endocrinology and Metabolism, University of Health Sciences, Sultan Abdulhamid Han Training and Research Hospital, Istanbul 34668, Turkey.
Senoymak I; Family Medicine Department, Usküdar State Hospital, Istanbul 34662, Turkey.
Firat SN; Department of Endocrinology and Metabolism, University of Health Sciences, Ankara Training and Research Hospital, Ankara 06230, Turkey.

J Pers Med; 14(4), 2024 Mar 22.

Article in English | MEDLINE | ID: mdl-38672957

ABSTRACT

This study investigates the utility of Chat Generative Pre-trained Transformer (ChatGPT) in addressing patient inquiries related to hyperprolactinemia and prolactinoma. A set of 46 questions commonly asked by patients with prolactinoma was presented to ChatGPT, and the responses were evaluated for accuracy on a 6-point Likert scale (1, completely inaccurate, to 6, completely accurate) and for adequacy on a 5-point Likert scale (1, completely inadequate, to 5, completely adequate). Two independent endocrinologists assessed the responses against international guidelines. Questions were categorized into groups including general information, diagnostic process, treatment process, follow-up, and pregnancy period. The median accuracy score was 6.0 (IQR, 5.4-6.0), and the median adequacy score was 4.5 (IQR, 3.5-5.0). The lowest accuracy and adequacy score assigned by both evaluators was 2. Significant agreement was observed between the evaluators, demonstrated by a weighted κ of 0.68 (p = 0.08) for accuracy and a κ of 0.66 (p = 0.04) for adequacy. Kruskal-Wallis tests revealed statistically significant differences among the groups for accuracy (p = 0.005) and adequacy (p = 0.023). The pregnancy period group had the lowest accuracy score, and both the pregnancy period and follow-up groups had the lowest adequacy score. In conclusion, ChatGPT gave commendable responses to prolactinoma queries; however, certain limitations were observed, particularly in providing accurate information related to the pregnancy period, emphasizing the need to refine its capabilities in medical contexts.
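The inter-rater agreement statistic named in the abstract can be illustrated with a short sketch. The abstract does not say which weighting scheme or software the authors used, so the linear weights, the function name `weighted_kappa`, and the example ratings below are all assumptions made for illustration and are not the study's data or method.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories, weighting="linear"):
    """Cohen's weighted kappa for two raters scoring the same items on an ordinal scale.

    `categories` lists every possible score in ascending order.
    `weighting` is "linear" or "quadratic" (the choice here is an assumption).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    def weight(i, j):
        # Disagreement weight grows with the distance between the two scores.
        d = abs(i - j) / (k - 1)
        return d if weighting == "linear" else d * d

    observed = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)

    # Weighted disagreement actually observed between the raters.
    obs_dis = sum(weight(i, j) * c / n for (i, j), c in observed.items())
    # Weighted disagreement expected by chance from the marginal distributions.
    exp_dis = sum(
        weight(i, j) * marg_a[i] * marg_b[j] / (n * n)
        for i in range(k)
        for j in range(k)
    )
    if exp_dis == 0:
        raise ValueError("kappa is undefined when both raters use one identical category")
    return 1.0 - obs_dis / exp_dis


if __name__ == "__main__":
    # Hypothetical accuracy ratings on the 6-point scale (illustrative only).
    scores_a = [6, 6, 5, 6, 4, 6, 5, 2, 6, 5]
    scores_b = [6, 5, 5, 6, 4, 6, 6, 3, 6, 5]
    print(round(weighted_kappa(scores_a, scores_b, categories=[1, 2, 3, 4, 5, 6]), 3))
```

Weighted kappa suits ordinal Likert scores because a one-point disagreement is penalized less than a four-point one, which unweighted kappa would treat identically.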

Keywords

ChatGPT; artificial intelligence; health literacy; hyperprolactinemia; prolactinoma

Full text: 1 | Collection: 01-international | Database: MEDLINE | Language: En | Journal: J Pers Med | Year: 2024 | Document type: Article | Country of affiliation: Turkey | Country of publication: Switzerland
