Patient-Friendly Discharge Summaries in Korea Based on ChatGPT: Software Development and Validation.

Kim, Hanjae; Jin, Hee Min; Jung, Yoon Bin; You, Seng Chan

Affiliation

Kim H; College of Nursing, Yonsei University, Seoul, Korea.
Jin HM; Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul, Korea.
Jung YB; Department of Surgery, Yonsei University College of Medicine, Seoul, Korea.
You SC; Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul, Korea. Chandryou@yuhs.ac.

J Korean Med Sci; 39(16): e148, 2024 Apr 29.

Article in English | MEDLINE | ID: mdl-38685890

ABSTRACT

BACKGROUND:

Although discharge summaries in patient-friendly language can enhance patient comprehension and satisfaction, they can also increase medical staff workload. Using a large language model, we developed and validated software that generates a patient-friendly discharge summary.

METHODS:

We developed and tested the software using 100 discharge summary documents, 50 for patients with myocardial infarction and 50 for patients treated in the Department of General Surgery. For each document, three new summaries were generated using three different prompting methods (Zero-shot, One-shot, and Few-shot) and graded using a 5-point Likert Scale regarding factuality, comprehensiveness, usability, ease, and fluency. We compared the effects of different prompting methods and assessed the relationship between input length and output quality.
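The three prompting methods differ only in how many worked examples precede the document to be rewritten. The abstract does not give the actual prompts, so the following is a minimal sketch under stated assumptions: the instruction text, the `build_prompt` helper, and the example pairs are all hypothetical and are not the authors' implementation.

```python
from typing import Sequence, Tuple

# Hypothetical instruction text; the study's real prompt is not given in the abstract.
INSTRUCTION = (
    "Rewrite the following discharge summary in plain language "
    "that a patient can understand."
)


def build_prompt(summary: str, examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Assemble a prompt with zero, one, or several worked examples.

    examples holds (original summary, patient-friendly rewrite) pairs:
    an empty sequence gives a Zero-shot prompt, one pair a One-shot prompt,
    and two or more pairs a Few-shot prompt.
    """
    parts = [INSTRUCTION]
    for original, rewritten in examples:
        parts.append(f"Example input:\n{original}\nExample output:\n{rewritten}")
    parts.append(f"Input:\n{summary}\nOutput:")
    return "\n\n".join(parts)


# Placeholder example pairs, invented purely for illustration.
demo_examples = [
    ("Dx: STEMI. PCI performed.", "You had a heart attack. We opened the blocked artery."),
    ("s/p lap chole, POD#2.", "You had keyhole surgery to remove your gallbladder two days ago."),
]

zero_shot = build_prompt("Dx: NSTEMI. Started on DAPT.")
one_shot = build_prompt("Dx: NSTEMI. Started on DAPT.", demo_examples[:1])
few_shot = build_prompt("Dx: NSTEMI. Started on DAPT.", demo_examples)
```

Keeping the instruction and the target document identical across the three variants means any difference in output quality can be attributed to the number of examples.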

RESULTS:

The mean overall scores differed across prompting methods (4.19 ± 0.36 with Few-shot, 4.11 ± 0.36 with One-shot, and 3.73 ± 0.44 with Zero-shot prompts; P < 0.001). Post-hoc analysis indicated that scores were higher with Few-shot and One-shot prompts than with Zero-shot prompts, whereas there was no significant difference between Few-shot and One-shot prompts. The overall proportion of outputs that scored ≥ 4 was 77.0% (95% confidence interval [CI], 68.8-85.3%), 70.0% (95% CI, 61.0-79.0%), and 32.0% (95% CI, 22.9-41.1%) with Few-shot, One-shot, and Zero-shot prompts, respectively. The mean factuality score was 4.19 ± 0.60 with Few-shot, 4.20 ± 0.55 with One-shot, and 3.82 ± 0.57 with Zero-shot prompts. Input length and the overall score were negatively correlated in the Zero-shot (r = -0.437, P < 0.001) and One-shot (r = -0.327, P < 0.001) tests but not in the Few-shot (r = -0.050, P = 0.625) tests.
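The abstract does not say how the confidence intervals for the proportions were computed. As a rough check, and assuming a normal-approximation (Wald) interval with n = 100 outputs per prompting method, the reported bounds can be approximated as follows; the `wald_ci` helper is illustrative only.

```python
import math
from typing import Tuple


def wald_ci(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation (Wald) confidence interval for a proportion.

    Assumption: the interval method is not stated in the abstract; this is
    one common choice, used here only as a plausibility check.
    """
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Counts implied by the reported percentages with 100 documents per method.
for label, count in [("Few-shot", 77), ("One-shot", 70), ("Zero-shot", 32)]:
    low, high = wald_ci(count, 100)
    print(f"{label}: {100 * low:.1f}% to {100 * high:.1f}%")
```

Under this assumption the computed bounds agree with the reported intervals to within about 0.1 percentage points, which suggests that the study used a similar approximation, although the exact method remains unconfirmed.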

CONCLUSION:

Large language models using Few-shot prompts generally produce acceptable discharge summaries without significant misinformation. Our research highlights the potential of such models to create patient-friendly discharge summaries for Korean patients in support of patient-centered care.

Subject(s)

Patient Discharge; Software; Humans; Republic of Korea; Myocardial Infarction/diagnosis; Patient Satisfaction; Patient Discharge Summaries; Electronic Health Records

Keywords

Artificial Intelligence; ChatGPT; Documentation; Large Language Model; Patient Discharge Summaries; Patient-Centered Care

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Patient Discharge / Software Limit: Humans Country/Region as subject: Asia Language: En Journal: J Korean Med Sci Journal subject: MEDICINE Year: 2024 Document type: Article Country of publication:
