Calibrating machine learning approaches for probability estimation: A comprehensive comparison.

Ojeda, Francisco M; Jansen, Max L; Thiéry, Alexandre; Blankenberg, Stefan; Weimar, Christian; Schmid, Matthias; Ziegler, Andreas

Affiliation

Ojeda FM; Department of Cardiology, University Heart and Vascular Center Hamburg, University Medical Center Hamburg-Eppendorf, Hamburg, Germany.
Jansen ML; Centre for Population Health Innovation (POINT), University Heart and Vascular Center Hamburg, University Medical Center Hamburg-Eppendorf, Hamburg, Germany.
Thiéry A; Cardio-CARE, Medizincampus Davos, Davos, Switzerland.
Blankenberg S; Cardio-CARE, Medizincampus Davos, Davos, Switzerland.
Weimar C; Department of Cardiology, University Heart and Vascular Center Hamburg, University Medical Center Hamburg-Eppendorf, Hamburg, Germany.
Schmid M; Centre for Population Health Innovation (POINT), University Heart and Vascular Center Hamburg, University Medical Center Hamburg-Eppendorf, Hamburg, Germany.
Ziegler A; German Center for Cardiovascular Research (DZHK), Partner Site Hamburg/Kiel/Lübeck, Hamburg, Germany.

Stat Med; 42(29): 5451-5478, 2023 Dec 20.

Article in English | MEDLINE | ID: mdl-37849356

ABSTRACT

Statistical prediction models have gained popularity in applied research. One challenge is transferring a prediction model to a population that may be structurally different from the one for which the model was developed. The model can be adapted to the new population by calibrating it to the characteristics of the target population, and numerous calibration techniques exist for this purpose. In view of this diversity, we performed a systematic evaluation of popular calibration approaches used by the statistical and machine learning communities for estimating two-class probabilities. In this work, we first review the literature and, second, present the results of a comprehensive simulation study. The calibration approaches are compared with respect to their empirical properties and relationships, their ability to generalize precise probability estimates to external populations, and their availability in easy-to-use software implementations. Third, we provide code from a real data analysis so that researchers can apply the methods. Logistic calibration and beta calibration, which estimate an intercept plus one and two slope parameters, respectively, consistently showed the best results in the simulation studies. Calibration on logit-transformed probability estimates generally outperformed calibration on nontransformed estimates. When training and validation data differ structurally, re-estimation of the entire prediction model should be weighed against the sample size of the validation data. For updating probability estimates in validation studies, we recommend regression-based calibration approaches using transformed probability estimates, where at least one slope is estimated in addition to an intercept.
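The two best-performing approaches named in the abstract can be illustrated with a minimal sketch. This is not the authors' published code. It assumes a simulated validation set (the variables `z`, `y`, and `p_raw` are hypothetical) and the commonly used parametrization of beta calibration, in which the calibrated logit is `c + a*log(p) - b*log(1 - p)`, fitted here without sign constraints on the slopes. Logistic calibration regresses the outcome on the logit of the original estimate, giving an intercept and one slope. Both fits use a small Newton-Raphson routine written only with the standard library.

```python
import math
import random


def sigmoid(z):
    """Numerically stable inverse logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def clip(p, eps=1e-12):
    """Keep probabilities strictly inside (0, 1) before taking logs."""
    return min(max(p, eps), 1.0 - eps)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_logistic(X, y, max_iter=50, tol=1e-10):
    """Maximum-likelihood logistic regression via Newton-Raphson.

    Returns [intercept, coef_1, ..., coef_k] for feature rows X.
    """
    d = len(X[0]) + 1
    beta = [0.0] * d
    for _ in range(max_iter):
        grad = [0.0] * d
        hess = [[0.0] * d for _ in range(d)]
        for xi, yi in zip(X, y):
            row = [1.0] + list(xi)
            p = sigmoid(sum(bj * v for bj, v in zip(beta, row)))
            w = p * (1.0 - p)
            for j in range(d):
                grad[j] += (yi - p) * row[j]
                for k in range(d):
                    hess[j][k] += w * row[j] * row[k]
        step = solve(hess, grad)
        beta = [bj + sj for bj, sj in zip(beta, step)]
        if max(abs(sj) for sj in step) < tol:
            break
    return beta


def brier(probs, y):
    """Mean squared difference between estimates and 0/1 outcomes."""
    return sum((p - yi) ** 2 for p, yi in zip(probs, y)) / len(y)


# Hypothetical validation data: the true event probability is sigmoid(z),
# while the transferred model reports the distorted value sigmoid(0.5 + 2z).
rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(4000)]
y = [1 if rng.random() < sigmoid(zi) else 0 for zi in z]
p_raw = [clip(sigmoid(0.5 + 2.0 * zi)) for zi in z]

# Logistic calibration: intercept plus one slope on logit(p).
X_logit = [[math.log(p / (1.0 - p))] for p in p_raw]
intercept, slope = fit_logistic(X_logit, y)
p_logistic = [sigmoid(intercept + slope * x[0]) for x in X_logit]

# Beta calibration (assumed parametrization): intercept plus two slopes,
# one on log(p) and one on -log(1 - p).
X_beta = [[math.log(p), -math.log(1.0 - p)] for p in p_raw]
c, a, b = fit_logistic(X_beta, y)
p_beta = [sigmoid(c + a * x[0] + b * x[1]) for x in X_beta]

brier_raw = brier(p_raw, y)
brier_logistic = brier(p_logistic, y)
brier_beta = brier(p_beta, y)
print(intercept, slope, brier_raw, brier_logistic, brier_beta)
```

In this simulated setup the distortion is linear on the logit scale, so logistic calibration can in principle undo it with an intercept near -0.25 and a slope near 0.5. Beta calibration contains logistic calibration as the special case `a = b`, which is why the two methods are closely related. The Brier scores here are computed on the same data used for fitting and are therefore optimistic.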

Subject(s)

Machine Learning; Models, Statistical; Humans; Logistic Models; Software; Probability

Keywords

calibration; logistic regression; machine learning; probability estimation; probability machine; updating

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Models, Statistical / Machine Learning Limits: Humans Language: En Journal: Stat Med Year: 2023 Document type: Article Country of affiliation: Germany Country of publication: United Kingdom