Results 1 - 4 of 4
1.
J Clin Epidemiol; 60(5): 491-501, 2007 May.
Article in English | MEDLINE | ID: mdl-17419960

ABSTRACT

OBJECTIVE: To investigate the behavior of predictive performance measures that are commonly used in the external validation of prognostic models for outcome at intensive care units (ICUs). STUDY DESIGN AND SETTING: Four prognostic models (the Simplified Acute Physiology Score II, the Acute Physiology and Chronic Health Evaluation II, and the Mortality Probability Models II at 0 and 24 hours) were evaluated in the Dutch National Intensive Care Evaluation registry database. For each model, discrimination (AUC), accuracy (Brier score), and two calibration measures were assessed on data from 41,239 ICU admissions. This validation procedure was repeated with smaller subsamples randomly drawn from the database, and the results were compared with those obtained on the entire data set. RESULTS: Differences in performance between the models were small. The AUC and Brier score showed large variation with small samples. Standard errors of AUC values were accurate, but the power to detect differences in performance was low. Calibration tests were extremely sensitive to sample size. Direct comparison of performance, without statistical analysis, was unreliable with either measure. CONCLUSION: Substantial sample sizes are required for performance assessment and model comparison in external validation. Calibration statistics and significance tests should not be used in these settings. Instead, a simple customization method to repair lack-of-fit problems is recommended.
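
As a rough illustration only (this is not code from the study), the sketch below computes the three kinds of measures the abstract refers to on simulated data: discrimination as the area under the ROC curve, accuracy as the Brier score, and a Hosmer-Lemeshow-style calibration statistic. The sample size, predicted risks, and outcomes are all invented, as are the choices of 10 risk groups and 8 degrees of freedom.

```python
# Hypothetical external-validation sketch: all data below are simulated.
import numpy as np
from scipy.stats import chi2
from sklearn.metrics import roc_auc_score, brier_score_loss

rng = np.random.default_rng(42)
n = 5000                                  # invented validation sample size
p = rng.uniform(0.02, 0.60, size=n)       # model-predicted mortality risks
y = rng.binomial(1, p)                    # simulated hospital mortality (0/1)

auc = roc_auc_score(y, p)                 # discrimination
brier = brier_score_loss(y, p)            # accuracy (mean squared error of risks)

# Hosmer-Lemeshow-style calibration statistic over 10 risk-decile groups
edges = np.quantile(p, np.linspace(0, 1, 11))
groups = np.clip(np.digitize(p, edges[1:-1]), 0, 9)
hl = 0.0
for g in range(10):
    in_g = groups == g
    observed, expected, n_g = y[in_g].sum(), p[in_g].sum(), in_g.sum()
    hl += (observed - expected) ** 2 / (expected * (1 - expected / n_g))
p_value = chi2.sf(hl, df=8)               # df = groups - 2 is a common choice

print(f"AUC={auc:.3f}  Brier={brier:.3f}  H-L={hl:.1f}  p={p_value:.3f}")
```

Re-running the sketch with a much smaller n makes the sampling variability of all three quantities visible, which is the problem the study quantifies for real ICU registry data.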


Subject(s)
Critical Illness/mortality, Intensive Care Units, Outcome Assessment (Health Care)/methods, Aged, Calibration, Epidemiologic Methods, Female, Humans, Male, Middle Aged, Netherlands/epidemiology, Prognosis
2.
Crit Care Med; 33(9): 1988-93, 2005 Sep.
Article in English | MEDLINE | ID: mdl-16148470

ABSTRACT

OBJECTIVE: The Sequential Organ Failure Assessment (SOFA) score was developed to quantify the severity of patients' illness, based on the degree of organ dysfunction. This study aimed to evaluate the accuracy and the reliability of SOFA scoring. DESIGN: Prospective study. SETTING: Adult intensive care unit (ICU) in a tertiary academic center. SUBJECTS: Thirty randomly selected patient cases and 20 ICU physicians. MEASUREMENTS AND MAIN RESULTS: Each physician scored 15 patient cases. The intraclass correlation coefficient was 0.889 for the total SOFA score. The weighted kappa values were moderate (0.552) for the central nervous system, good (0.634) for the respiratory system, and almost perfect (>0.8) for the other organ systems. To assess accuracy, the physicians' scores were compared with a gold standard based on consensus of two experts. The total SOFA score was correct in 53% (n = 158) of the cases. The mean of the absolute deviations of the recorded total SOFA scores from the gold standard total SOFA scores was 0.82. Common causes of errors were inattention, calculation errors, and misinterpretation of scoring rules. CONCLUSIONS: The results of this study indicate that the reliability and the accuracy of SOFA scoring among physicians are good. We advise implementation of additional measures to further improve reliability and accuracy of SOFA scoring.
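
For readers unfamiliar with the agreement statistics mentioned here, the following hypothetical sketch (not the authors' analysis code) computes a quadratic-weighted kappa for one organ-system sub-score and a one-way intraclass correlation coefficient for total scores. The rater data are invented; the study itself involved 20 physicians and a two-expert consensus gold standard.

```python
# Hypothetical inter-rater agreement sketch: all ratings below are made up.
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Invented ordinal sub-scores (0-4) given by two hypothetical raters to 15 cases
rater_a = [0, 1, 2, 2, 3, 4, 1, 0, 2, 3, 4, 2, 1, 3, 0]
rater_b = [0, 1, 2, 3, 3, 4, 1, 1, 2, 3, 4, 2, 2, 3, 0]
kappa = cohen_kappa_score(rater_a, rater_b, weights="quadratic")

# One-way random-effects ICC(1,1) for invented total scores:
# rows are cases, columns are the two raters
totals = np.array([[8, 9], [4, 4], [11, 12], [6, 6], [2, 3]], dtype=float)
n_cases, n_raters = totals.shape
grand_mean = totals.mean()
ms_between = n_raters * ((totals.mean(axis=1) - grand_mean) ** 2).sum() / (n_cases - 1)
ms_within = ((totals - totals.mean(axis=1, keepdims=True)) ** 2).sum() / (n_cases * (n_raters - 1))
icc = (ms_between - ms_within) / (ms_between + (n_raters - 1) * ms_within)

print(f"weighted kappa = {kappa:.3f}   ICC(1,1) = {icc:.3f}")
```

Quadratic weighting penalizes large disagreements on an ordinal 0-4 scale more heavily than near-misses, which is why it is a common choice for sub-scores like these.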


Subject(s)
Critical Illness, Severity of Illness Index, Coma/complications, Female, Humans, Intensive Care Units, Male, Middle Aged, Prospective Studies
3.
Methods Inf Med; 44(5): 616-25, 2005.
Article in English | MEDLINE | ID: mdl-16400369

ABSTRACT

OBJECTIVES: The usability of terminological systems (TSs) strongly depends on the coverage and correctness of their content. The objective of this study was to create a literature overview of aspects related to the content of TSs and of methods for evaluating that content, and to investigate the extent to which these methods overlap or complement each other. METHODS: We reviewed the literature and composed definitions for aspects of the evaluation of the content of TSs. Of the methods described in the literature, three were selected: 1) concept matching, in which two samples of concepts, representing a) documentation of reasons for admission in daily care practice and b) aggregation of patient groups for research, are looked up in the TS in order to assess its coverage; 2) formal algorithmic evaluation, in which reasoning on the formally represented content is used to detect inconsistencies; and 3) expert review, in which a random sample of concepts is checked for incorrect and incomplete terms and relations. These evaluation methods were applied in a case study on the locally developed TS DICE (Diagnoses for Intensive Care Evaluation). RESULTS: None of the applied methods covered all aspects of the content of a TS. The results of concept matching differed for the two use cases (63% vs. 52% perfect matches). Expert review revealed many more errors and omissions than formal algorithmic evaluation. CONCLUSIONS: To evaluate the content of a TS, a combination of evaluation methods is preferable. Different representative samples, reflecting the uses of TSs, lead to different results for concept matching. Expert review appears to be very valuable but time-consuming. Formal algorithmic evaluation has the potential to decrease the workload of human reviewers but detects only logical inconsistencies. Further research is required to exploit the potential of formal algorithmic evaluation.
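
To make the first of the three methods concrete, here is a small, purely hypothetical sketch of concept matching: a sample of free-text concepts is looked up in a toy terminological system and each lookup is classified as a perfect match, partial match, or no match, from which coverage is computed. The terms, synonym sets, and substring-based matching rule are assumptions for illustration; this is not the DICE implementation described in the paper.

```python
# Toy "terminological system": preferred terms mapped to sets of synonyms.
TS = {
    "septic shock": {"septic shock", "shock, septic"},
    "community-acquired pneumonia": {"community-acquired pneumonia", "cap"},
    "acute kidney injury": {"acute kidney injury", "acute renal failure"},
}

def match(concept: str) -> str:
    """Classify a free-text concept as a perfect, partial, or no match."""
    text = concept.strip().lower()
    for synonyms in TS.values():
        if text in synonyms:
            return "perfect"
        if any(text in s or s in text for s in synonyms):
            return "partial"
    return "no match"

# Hypothetical sample of reasons for ICU admission drawn from patient records
sample = ["Septic shock", "pneumonia", "subarachnoid haemorrhage"]
results = {c: match(c) for c in sample}
coverage = sum(r == "perfect" for r in results.values()) / len(sample)
print(results, f"perfect-match coverage = {coverage:.0%}")
```

In a real evaluation the lookup would go through the TS's own term index and synonym tables rather than substring heuristics, and the two samples (reasons for admission vs. research groupings) would be matched separately, as in the 63% vs. 52% figures above.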


Subject(s)
Evaluation Studies as Topic, Medical Informatics/standards, Terminology as Topic, Netherlands, Organizational Case Studies
4.
AMIA Annu Symp Proc; 779, 2003.
Article in English | MEDLINE | ID: mdl-14728284

ABSTRACT

The importance of terminological systems (TSs) for the medical domain is widely recognized. The usability of such a system depends primarily on its content. We have designed four methods to evaluate the content of TSs and applied them in a case study.


Subject(s)
Evaluation Studies as Topic, Vocabulary, Controlled, Humans, Information Storage and Retrieval, Medical Records Systems, Computerized, Terminology as Topic