ABSTRACT
INTRODUCTION: At the end of 2019 there was an outbreak of pneumonia caused by a new coronavirus; the disease was named coronavirus disease 2019 (COVID-19). Computed tomography (CT) has played an important role in the diagnosis of patients with COVID-19. OBJECTIVE: To demonstrate inter-observer variability with five scales proposed for measuring the extent of COVID-19 pneumonia on CT. METHODS: Thirty-five initial chest CT scans of patients who attended respiratory triage for suspected COVID-19 pneumonia were analyzed. Three radiologists classified the CT images according to the severity scales proposed by Yang (1), Yuan (2), Chun (3), Wang (4) and Instituto Nacional de Enfermedades Respiratorias-Chung-Pan (INER-Chung-Pan) (5). Agreement between the evaluators for each scale was calculated using the intraclass correlation coefficient. RESULTS: Most patients (77.1%) had involvement of all five pulmonary lobes. Scales 1, 2, 4 and 5 showed an intraclass correlation > 0.91 (p < 0.0001), indicating almost perfect agreement. CONCLUSIONS: Scale 4 (proposed by Wang) showed the best inter-observer agreement, with a coefficient of 0.964 (p = 0.001).
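The abstract reports intraclass correlation coefficients but does not say which form was used. As a minimal sketch, assuming the two-way random-effects, absolute-agreement, single-rater form (often written ICC(2,1)), the calculation for a table of subjects by raters could look like the following. The function name and the toy ratings are hypothetical and are not the study data.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores: list of rows, one row per subject (e.g. CT scan),
            one column per rater (e.g. radiologist).
    Assumption: this ICC form is chosen for illustration only.
    """
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for subjects, raters, and residual error
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    # Mean squares
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical severity scores: 4 scans rated by 3 readers
toy_scores = [
    [10, 11, 10],
    [4, 5, 4],
    [18, 18, 17],
    [7, 8, 8],
]
print(round(icc_2_1(toy_scores), 3))
```

Values close to 1 mean that most of the variance comes from real differences between scans rather than from disagreement between readers.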
Subjects
COVID-19 , Pneumonia , Humans , Observer Variation , Pneumonia/diagnostic imaging , Pneumonia/epidemiology , Retrospective Studies , SARS-CoV-2 , X-Ray Computed Tomography
ABSTRACT
Objective: The Michigan State University (MSU) classification of lumbar disc herniation (LDH) is periodically used by various authors to classify disc herniation. We assessed the reliability of this classification system among orthopedic residents at our institute. Methods: Fifty T2 axial-cut magnetic resonance (MR) images corresponding to the level of maximal disc herniation, from patients diagnosed with a single LDH, were selected and distributed to six orthopedic residents. All six residents rated each image according to the MSU classification; in addition, three residents rated the images on two different occasions. The degree of agreement among residents was analyzed by calculating inter-observer and intra-observer reliability using the kappa statistic. Results: The inter-observer reliability among the six residents, calculated as Fleiss' kappa, was 0.422, which indicates moderate reliability. The intra-observer reliability of the three selected residents, calculated as Cohen's kappa, was 0.750, 0.772, and 0.859, which indicates substantial to almost perfect reliability. Variations in ratings were frequent in images showing a broad-based disc herniation with spinal canal stenosis. Conclusion: Our findings demonstrate moderate homogeneity of the ratings given by residents; however, test-retest results showed the ratings to be consistent. Level of Evidence II, Diagnostic studies - investigating a diagnostic examination.
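For readers unfamiliar with the statistic, the sketch below shows how Fleiss' kappa can be computed for several raters, together with a helper that maps a kappa value to a verbal label. The labels follow the commonly used Landis and Koch bands, which match the wording in the abstract; the abstract does not state which scale the authors applied, so that mapping is an assumption. Function names and the example counts are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a fixed number of raters per item.

    counts[i][j] = number of raters who assigned item i to category j.
    Every row must sum to the same number of raters.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    total = n_items * n_raters

    # Proportion of all assignments that fall in each category
    p_j = [sum(row[j] for row in counts) / total for j in range(n_cats)]
    # Per-item agreement: share of rater pairs that agree on the item
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_items        # mean observed agreement
    p_e = sum(p * p for p in p_j)     # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)


def landis_koch(kappa):
    """Verbal label for a kappa value (assumed Landis and Koch bands)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical example: 4 images, 6 raters, 3 categories
example_counts = [
    [6, 0, 0],
    [3, 3, 0],
    [0, 5, 1],
    [1, 1, 4],
]
k = fleiss_kappa(example_counts)
print(round(k, 3), landis_koch(k))

# Labels for the values reported in the abstract
for value in (0.422, 0.750, 0.772, 0.859):
    print(value, landis_koch(value))
```

Under these bands, 0.422 falls in "moderate", 0.750 and 0.772 in "substantial", and 0.859 in "almost perfect", which is consistent with the interpretation given in the abstract.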
ABSTRACT
INTRODUCTION AND AIM: Transient elastography (TE) is gaining popularity as a non-invasive method for predicting liver fibrosis, but inter-observer agreement and the factors influencing reproducibility have not been adequately assessed. MATERIAL AND METHODS: This cross-sectional study was conducted at Specialized Medical Hospital and the Egyptian Liver Foundation, Mansoura, Egypt. The inclusion criteria were age older than 18 years and chronic hepatitis C infection. The exclusion criteria were the presence of ascites, a pacemaker, or pregnancy. Three hundred and fifty-six patients participated in the study; accordingly, 356 pairs of exams were performed by two operators on the same day. RESULTS: The overall inter-observer agreement ICC was 0.921. The correlation between the two operators was excellent (Spearman's ρ = 0.808, p < 0.001). The inter-observer reliability was κ = 0.557 (p < 0.001). A non-negligible discordance in fibrosis staging between operators was observed (87 cases, 24.4%). Discordance of one stage and of two or more stages of fibrosis occurred in 60 (16.9%) and 27 (7.6%) cases, respectively. Obesity (BMI ≥ 30 kg/m²) was the main factor associated with discordance (p = 0.002). CONCLUSION: Although liver stiffness measurement showed an excellent correlation between the two operators, TE presented an inter-observer variability that may not be negligible.
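The discordance figures in this abstract can be cross-checked with simple arithmetic. The short sketch below, using only the counts stated in the abstract, confirms that the 60 and 27 cases add up to the 87 discordant pairs, which suggests that the first group differs by exactly one stage. The variable and function names are hypothetical.

```python
# Counts taken from the abstract
n_pairs = 356       # paired examinations
one_stage = 60      # pairs discordant by one fibrosis stage
two_or_more = 27    # pairs discordant by two or more stages

discordant = one_stage + two_or_more


def pct(count, total):
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


print(discordant)                  # 87
print(pct(one_stage, n_pairs))     # 16.9
print(pct(two_or_more, n_pairs))   # 7.6
print(pct(discordant, n_pairs))    # 24.4
```

The two subgroups are therefore mutually exclusive and together account for every discordant pair reported.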
Subjects
Elasticity Imaging Techniques , Chronic Hepatitis C/complications , Liver Cirrhosis/diagnostic imaging , Obesity/complications , Adult , Body Mass Index , Cross-Sectional Studies , Egypt , Female , Chronic Hepatitis C/diagnosis , Chronic Hepatitis C/virology , Humans , Liver Cirrhosis/virology , Male , Middle Aged , Obesity/diagnosis , Observer Variation , Predictive Value of Tests , Reproducibility of Results
ABSTRACT
OBJECTIVE: To identify factors that influence the inter-observer reproducibility of routine, conventional Pap smear cytology (Pap smear test) in a network of certified laboratories in a middle-income Latin American country. METHODS: Twenty-six laboratories each provided an average of 26 Pap smears classified as negative for intraepithelial lesion or malignancy (NILM) or high-grade squamous intraepithelial lesion (HSIL). An external panel reviewed the slides. The kappa index and multilevel logistic regression were used to estimate the reproducibility and the odds ratios (OR) of a false result with 95% confidence intervals (95% CI), respectively. Results are presented for laboratories that collect samples (collector laboratories) and those that do not (non-collector laboratories). RESULTS: Agreement ranged widely (median kappa 0.51, range 0.16-0.70). The overall false-positive (FP) and false-negative (FN) rates were 31% (95% CI 27-35) and 11% (95% CI 7-17). Among collector laboratories (N = 14), a larger sample collection volume decreased the probability of an FP (adjusted OR 0.05, 95% CI 0.02-0.1), whereas the number of quality defects (adjusted OR 1.67, 95% CI 1.25-2.24), high workload (adjusted OR 5.52, 95% CI 3.85-7.92) and collection by cytotechnologists (adjusted OR 1.28, 95% CI 1.15-1.42) or health professionals (adjusted OR 2.26, 95% CI 2.04-2.49) instead of nursing assistants increased it. Among non-collector laboratories (N = 9), the FP rate increased with the number of quality defects (adjusted OR 1.86, 95% CI 1.06-3.26) but decreased if the samples were collected by health professionals instead of nursing assistants (adjusted OR 0.37, 95% CI 0.17-0.80). No significant associations were observed for FN. CONCLUSIONS: The staff in charge of cervical sampling significantly influenced the reproducibility of the Pap smear test, but the effect depended on whether the laboratory collects samples or reads samples collected elsewhere.
Subjects
Cervix Uteri/pathology , Squamous Intraepithelial Lesions of the Cervix/pathology , Uterine Cervical Dysplasia/pathology , Uterine Cervical Neoplasms/pathology , Adult , Aged , Cross-Sectional Studies , Female , Humans , Laboratories , Middle Aged , Multilevel Analysis , Papanicolaou Test/methods , Reproducibility of Results , Vaginal Smears/methods
ABSTRACT
PURPOSE: We assessed agreement among neurosurgeons on the surgical approach to individual glioblastoma patients, and between their approaches and those recommended by the topographical staging system described by Shinoda. METHODS: Five neurosurgeons were provided with pre-surgical MRIs of 76 patients. They selected the surgical approach [biopsy, partial resection, or gross total resection (GTR)] that they would recommend for each patient. They were blinded to each other's responses and were told that the patients were younger than 50 years and without symptoms. Three neuroradiologists classified each case according to the Shinoda staging system. RESULTS: Biopsy was recommended in 35.5-82.9%, partial resection in 6.6-32.9%, and GTR in 3.9-31.6% of cases. Agreement among the neurosurgeons' responses was fair (global kappa = 0.28). Nineteen patients were classified as stage I, 14 as stage II, and 43 as stage III. Agreement between the neurosurgeons and the recommendations of the staging system was poor for stage I (kappa = 0.14) and stage II (kappa = 0.02) and fair for stage III patients (kappa = 0.29). An individual analysis revealed that, in contrast to the Shinoda system, neurosurgeons took into account T2/FLAIR sequences and gave greater weight to the involvement of eloquent areas. CONCLUSIONS: The surgical approach to glioblastoma is highly variable. A staging system could be used to examine the impact of extent of resection, monitor post-operative complications, and stratify patients in clinical trials. Our findings suggest that the Shinoda staging system could be improved by including T2/FLAIR sequences and a more adequate weighting of eloquent areas.
Subjects
Brain Neoplasms/surgery , Glioblastoma/surgery , Neoplasm Staging/methods , Neurosurgical Procedures/standards , Adult , Brain Neoplasms/pathology , Clinical Trials, Phase II as Topic , Glioblastoma/pathology , Humans , Male , Middle Aged , Neurosurgeons/standards , Neurosurgical Procedures/methods , Randomized Controlled Trials as Topic , Surveys and Questionnaires
ABSTRACT
Introduction: Developmental dysplasia of the hip (DDH) is a spectrum of diseases ranging from frank dislocation of the hip to mild acetabular dysplasia. Screening for DDH is performed routinely in our country using a pelvic x-ray at 3 months of age. The acetabular index measured on these radiographs is used to evaluate the dysplastic hip, both at initial presentation and during follow-up. Objective: To evaluate the intra- and inter-observer variability in the measurement of the acetabular index among medical professionals. Methods: Four reviewers (a pediatric orthopedic surgeon, a general practitioner, a pediatrician and a radiologist) measured the acetabular index in 100 screening radiographs (200 hips) on three occasions, each separated by one month (600 total measurements). An independent observer evaluated the reproducibility of the measurements. The intraclass correlation coefficient was used to determine significant differences. Results: The intra-observer variability was lower than the inter-observer variability. The intra-observer variability was similar among the different reviewers, +/- 1.5 degrees. The inter-observer variability was +/- 3.4 degrees. Conclusions: We demonstrated high concordance among measurements, indicating high reproducibility of the acetabular index. The acetabular index is a reliable method for the diagnosis and follow-up of acetabular dysplasia.
Subjects
Humans , Infant , Acetabulum/pathology , Acetabulum , Hip Dislocation, Congenital/pathology , Hip Dislocation, Congenital , Observer Variation , Reproducibility of Results , Mass Screening/methods
ABSTRACT
The histological criteria for determining the degree of dysplasia, the Broders classification and the front of tumor invasion (FTI) are subjective, non-quantifiable parameters that may indicate the degree of progression of dysplasias and carcinomas. An important factor to consider during histological evaluation is the variability in diagnosis among pathologists. The objective was to standardize the criteria and determine the intra- and inter-observer variability in the diagnosis of dysplasias and oral squamous cell carcinoma (OSCC). Morphological criteria for diagnosis were selected and standardized, and randomly selected cases (30 dysplasias and 30 carcinomas) from the Laboratory of Clinical and Experimental Pathology of the DEPeI, FO, UNAM, were reviewed by three oral pathologists. Each pathologist analyzed and recorded the established parameters for dysplasia, OSCC and FTI on two occasions. The kappa test was applied to assess intra- and inter-observer agreement. Observer 1 vs. observer 2 showed an agreement of 0.75 for OSCC and 0.60 for dysplasias, with an intra-observer agreement of 0.90. Observer 2 vs. observer 3 showed an agreement of 0.75 for OSCC and 0.59 for dysplasias, with an intra-observer agreement of 0.91. Observer 3 vs. observer 1 showed an agreement of 0.77 for OSCC and 0.59 for dysplasias, with an intra-observer agreement of 0.92. Intra- and inter-observer agreement for OSCC was good to excellent, whereas for dysplasias it was acceptable, confirming that their assessment is more difficult. With adequate standardization, good agreement between pathologists can be obtained.
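The pairwise comparisons above (observer 1 vs. 2, 2 vs. 3, 3 vs. 1) are the setting in which Cohen's kappa is typically used. As a minimal sketch, assuming unweighted kappa on categorical grades (the abstract does not say whether a weighted variant was applied), the calculation for one pair of observers could look like this. The function name and the example gradings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must label the same cases")
    n = len(rater_a)

    # Observed agreement: share of cases with identical labels
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement from each rater's marginal label frequencies
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    return (observed - expected) / (1 - expected)


# Hypothetical gradings of 10 cases by two pathologists
pathologist_1 = ["high", "high", "high", "high", "high",
                 "low", "low", "low", "low", "low"]
pathologist_2 = ["high", "high", "high", "high", "low",
                 "low", "low", "low", "low", "high"]
print(round(cohen_kappa(pathologist_1, pathologist_2), 2))  # 0.6
```

In this toy case the observers agree on 8 of 10 cases, chance agreement is 0.5, and kappa is (0.8 - 0.5) / (1 - 0.5) = 0.6.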