Prompt engineering to increase GPT3.5's performance on the Plastic Surgery In-Service Exams.
J Plast Reconstr Aesthet Surg; 98: 158-160, 2024 Sep 05.
Article | MEDLINE | ID: mdl-39255523
ABSTRACT
This study assesses ChatGPT's (GPT-3.5) performance on the 2021 ASPS Plastic Surgery In-Service Examination using prompt modifications and Retrieval-Augmented Generation (RAG). ChatGPT was instructed to act as a "resident," "attending," or "medical student," and RAG drew context from a curated vector database. Neither approach produced a significant improvement: the "resident" prompt yielded the highest accuracy at 54%, and RAG failed to enhance performance, with accuracy remaining at 54.3%. Although ChatGPT reasoned appropriately when it answered correctly, its overall performance fell in the 10th percentile, indicating that fine-tuning and more sophisticated approaches are needed to improve AI's utility in complex medical tasks.
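The role-prompting and retrieval setup described above can be sketched in miniature as follows. This is an illustrative toy only, not the study's pipeline: the bag-of-words "embedding," the document snippets, and all function names are hypothetical stand-ins for a real embedding model and vector database.

```python
# Toy sketch of role prompting + retrieval-augmented context.
# All names and data here are illustrative, not from the study.
from collections import Counter
import math

def embed(text):
    """Crude bag-of-words vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, docs, k=1):
    """Return the k documents most similar to the question."""
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(role, question, docs):
    """Prepend a role instruction and retrieved context to an exam question."""
    context = "\n".join(retrieve(question, docs))
    return (f"You are a plastic surgery {role}. "
            f"Use the context below to answer.\n\n"
            f"Context:\n{context}\n\n"
            f"Question: {question}")

# Hypothetical two-document "vector database".
docs = [
    "Flap perfusion depends on the angiosome of the pedicle.",
    "Carpal tunnel syndrome involves median nerve compression.",
]
prompt = build_prompt("resident",
                      "What nerve is compressed in carpal tunnel syndrome?",
                      docs)
```

In this sketch the retrieved snippet and role instruction are simply concatenated into the prompt sent to the model, which mirrors the basic shape of the RAG and role-prompt conditions the abstract compares.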
Full text: 1
Collection: 01-international
Database: MEDLINE
Language: English
Journal: J Plast Reconstr Aesthet Surg
Year: 2024
Document type: Article
Country of affiliation: United States
Country of publication: Netherlands