ChatGPT and retinal disease: a cross-sectional study on AI comprehension of clinical guidelines.
Balas, Michael; Mandelcorn, Efrem D; Yan, Peng; Ing, Edsel B; Crawford, Sean A; Arjmand, Parnian.
Affiliation
  • Balas M; Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada.
  • Mandelcorn ED; Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada; Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, ON, Canada; University Health Network, University of Toronto, Toronto, ON, Canada; Kensington Eye Institute, Toronto, ON, Canada.
  • Yan P; Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada; Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, ON, Canada; University Health Network, University of Toronto, Toronto, ON, Canada; Kensington Eye Institute, Toronto, ON, Canada.
  • Ing EB; Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, ON, Canada; Department of Ophthalmology and Visual Sciences, University of Alberta, Edmonton, AB, Canada.
  • Crawford SA; Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada; University Health Network, University of Toronto, Toronto, ON, Canada; Division of Vascular Surgery, Department of Surgery, University of Toronto, Toronto, ON, Canada; Peter Munk Cardiac Centre, Toronto General Hospital, Univer
  • Arjmand P; Mississauga Retina Institute, Mississauga, ON, Canada. Electronic address: parnian.arjmand@medportal.ca.
Can J Ophthalmol; 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39097289
ABSTRACT

OBJECTIVE:

To evaluate the performance of an artificial intelligence (AI) large language model, ChatGPT (version 4.0), in answering questions on common retinal diseases in accordance with the American Academy of Ophthalmology (AAO) Preferred Practice Pattern (PPP) guidelines.

DESIGN:

A cross-sectional survey design was employed to compare ChatGPT's responses with established clinical guidelines.

PARTICIPANTS:

Responses generated by the AI were evaluated by a panel of three vitreoretinal specialists.

METHODS:

To investigate ChatGPT's comprehension of clinical guidelines, we designed 130 questions covering a broad spectrum of topics within 12 AAO PPP domains of retinal disease. These questions were crafted to encompass diagnostic criteria, treatment guidelines, and management strategies, including both medical and surgical aspects of retinal care. A panel of three retinal specialists independently evaluated each response on a Likert scale from 1 to 5 based on its relevance, accuracy, and adherence to AAO PPP guidelines. Response readability was evaluated using Flesch Reading Ease and Flesch-Kincaid Grade Level scores, as sketched below.
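The abstract does not specify how the readability metrics were computed; the following is a minimal illustrative sketch assuming the open-source Python textstat package, which implements both the Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. The sample response text is hypothetical, not taken from the study.

```python
# Illustrative sketch only: the study does not name its readability tooling.
# Assumes the open-source `textstat` package (pip install textstat), which
# implements Flesch Reading Ease (0-100, higher = easier to read) and
# Flesch-Kincaid Grade Level (approximate U.S. school grade).
import textstat

# Hypothetical excerpt of a ChatGPT response to a retinal-disease question.
response = (
    "Neovascular age-related macular degeneration is typically managed with "
    "intravitreal anti-VEGF injections, with treatment intervals adjusted "
    "according to disease activity on optical coherence tomography."
)

reading_ease = textstat.flesch_reading_ease(response)
grade_level = textstat.flesch_kincaid_grade(response)

print(f"Flesch Reading Ease: {reading_ease:.1f}")
print(f"Flesch-Kincaid Grade Level: {grade_level:.1f}")
```

On the standard Flesch scale, a Reading Ease score below roughly 50 and a grade level of 13 or higher correspond to the college-to-graduate comprehension level reported in the results.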

RESULTS:

ChatGPT achieved an overall average score of 4.9/5.0, suggesting high alignment with the AAO PPP guidelines. Scores varied across domains, with the lowest in the surgical management of disease. The responses had a low reading ease score and required a college-to-graduate level of comprehension. Identified errors were related to diagnostic criteria, treatment options, and methodological procedures.
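The overall and per-domain scores described above amount to averaging the three specialists' Likert ratings per question and then per domain. A hypothetical pandas sketch of that aggregation is shown below; the domain names and scores are invented for illustration, not the study's data.

```python
# Illustrative aggregation of panel ratings; the values below are invented.
import pandas as pd

ratings = pd.DataFrame(
    {
        "domain":   ["AMD", "AMD", "AMD", "Retinal detachment", "Retinal detachment", "Retinal detachment"],
        "question": [1, 1, 1, 2, 2, 2],
        "rater":    ["A", "B", "C", "A", "B", "C"],
        "score":    [5, 5, 4, 5, 4, 4],  # Likert 1-5 vs. AAO PPP guidelines
    }
)

# Mean score per question (averaged across the three raters), then per domain.
per_question = ratings.groupby(["domain", "question"])["score"].mean()
per_domain = per_question.groupby("domain").mean()
overall = per_question.mean()

print(per_domain)
print(f"Overall average score: {overall:.1f}/5.0")
```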

CONCLUSION:

ChatGPT 4.0 demonstrated significant potential in generating guideline-concordant responses, particularly for common medical retinal diseases. However, its performance slightly decreased in surgical retina, highlighting the ongoing need for clinician input, further model refinement, and improved comprehensibility.

Full text: 1 Collection: 01-international Database: MEDLINE Language: English Journal: Can J Ophthalmol Year: 2024 Document type: Article Country of affiliation: Canada Country of publication: United Kingdom
