ABSTRACT
Linked administrative data offer a rich source of information that can be harnessed to describe patterns of disease, understand their causes and evaluate interventions. However, administrative data are primarily collected for operational reasons such as recording vital events for legal purposes, and planning, provision and monitoring of services. The processes involved in generating and linking administrative datasets may generate sources of bias that are often not adequately considered by researchers. We provide a framework describing these biases, drawing on our experiences of using the 100 Million Brazilian Cohort (100MCohort) which contains records of more than 131 million people whose families applied for social assistance between 2001 and 2018. Datasets for epidemiological research were derived by linking the 100MCohort to health-related databases such as the Mortality Information System and the Hospital Information System. Using the framework, we demonstrate how selection and misclassification biases may be introduced in three different stages: registering and recording of people's life events and use of services, linkage across administrative databases, and cleaning and coding of variables from derived datasets. Finally, we suggest eight recommendations which may reduce biases when analysing data from administrative sources.
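The linkage stage of the framework can be illustrated with a toy sketch. The records, the exact-match rule on name and birth date, and all variable names below are hypothetical; this is not the linkage procedure used for the 100MCohort. The sketch only shows how a missed link can leave a person who died classified as alive in the derived dataset.

```python
# Toy illustration only: hypothetical cohort and death records.
cohort = [
    {"id": 1, "name": "maria silva", "dob": "1980-02-01"},
    {"id": 2, "name": "joao souza", "dob": "1975-07-15"},
    {"id": 3, "name": "ana lima", "dob": "1990-11-30"},
    {"id": 4, "name": "pedro costa", "dob": "1968-03-09"},
]

# The second death record carries a spelling variant ("sousa"),
# so an exact match on the linkage key will fail for that person.
deaths = [
    {"name": "maria silva", "dob": "1980-02-01"},
    {"name": "joao sousa", "dob": "1975-07-15"},
]

# Ground truth, known here only because the data are made up.
true_dead_ids = {1, 2}


def link_key(record):
    """Assumed linkage key: exact name plus date of birth."""
    return (record["name"], record["dob"])


death_keys = {link_key(d) for d in deaths}
linked_dead_ids = {p["id"] for p in cohort if link_key(p) in death_keys}

# People who died but were not linked are misclassified as alive.
missed = true_dead_ids - linked_dead_ids

true_mortality = len(true_dead_ids) / len(cohort)
observed_mortality = len(linked_dead_ids) / len(cohort)
print(missed, true_mortality, observed_mortality)
```

In this toy example one of the two deaths is missed, so observed mortality is half the true value. If missed links are more common in some groups than in others, the resulting misclassification differs between groups and can bias comparisons.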
Subject(s)
Medical Record Linkage , Humans , Bias , Epidemiologic Studies , Databases, Factual , Brazil/epidemiology
ABSTRACT
BACKGROUND: Cardiovascular disease (CVD) has a disproportionate effect on mortality among the poorest people. We assessed the impact on CVD and all-cause mortality of the world's largest conditional cash transfer, Brazil's Bolsa Família Programme (BFP). METHODS: We linked administrative data from the 100 Million Brazilian Cohort with BFP receipt and national mortality data. We followed individuals who applied for BFP between 1 January 2011 and 31 December 2015, until 31 December 2015. We used marginal structural models to estimate the effect of BFP on all-age and premature (30-69 years) CVD and all-cause mortality. We conducted stratified analyses by levels of material deprivation and access to healthcare. We checked the robustness of our findings by restricting the analysis to municipalities with better mortality data and by using alternative statistical methods. RESULTS: We studied 17,981,582 individuals, of whom 4,855,324 were aged 30-69 years. Three-quarters (76.2%) received BFP, with a mean follow-up post-award of 2.6 years. We detected 106,807 deaths by all causes, of which 60,893 were premature; and 23,389 CVD deaths, of which 15,292 were premature. BFP was associated with reductions in premature all-cause mortality [hazard ratio (HR) = 0.96, 95% CI = 0.94-0.98], premature CVD (HR = 0.96, 95% CI = 0.92-1.00) and all-age CVD (HR = 0.96, 95% CI = 0.93-1.00) but not all-age all-cause mortality (HR = 1.00, 95% CI = 0.98-1.02). In stratified and robustness analyses, BFP was consistently associated with mortality reductions for individuals living in the two most deprived quintiles. CONCLUSIONS: BFP appears to have a small to null effect on premature CVD and all-cause mortality in the short term; the long-term impact remains unknown.
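The weighting idea behind marginal structural models can be sketched in a much simpler setting than the one in this study. The example below assumes a single binary confounder (deprivation), a single treatment decision, and a risk ratio rather than a hazard ratio; the counts are invented and do not come from the study. It shows how inverse probability of treatment weights remove confounding that distorts a crude comparison.

```python
# Hypothetical counts: (deprived, received_transfer, people, deaths).
# Within each deprivation stratum the death risk is the same in both arms.
strata = [
    (0, 1, 20, 2),
    (0, 0, 80, 8),
    (1, 1, 80, 16),
    (1, 0, 20, 4),
]


def propensity(level):
    """Share of people in a deprivation stratum who received the transfer."""
    treated = sum(n for l, a, n, _ in strata if l == level and a == 1)
    total = sum(n for l, a, n, _ in strata if l == level)
    return treated / total


def risk(arm, weighted):
    """Death risk in one arm, optionally weighted by
    1 / P(observed treatment | deprivation stratum)."""
    deaths = people = 0.0
    for l, a, n, d in strata:
        if a != arm:
            continue
        p = propensity(l)
        if weighted:
            w = 1 / p if a == 1 else 1 / (1 - p)
        else:
            w = 1.0
        deaths += w * d
        people += w * n
    return deaths / people


crude_rr = risk(1, False) / risk(0, False)
weighted_rr = risk(1, True) / risk(0, True)
print(round(crude_rr, 3), round(weighted_rr, 3))
```

With these made-up counts the crude risk ratio is 1.5, because recipients are concentrated in the more deprived stratum, while the weighted risk ratio is 1.0, matching the absence of any within-stratum difference. A full marginal structural model extends this to time-varying treatment and confounders.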
Subject(s)
Cardiovascular Diseases , Poverty , Humans , Brazil/epidemiology
ABSTRACT
The "thrifty genotype" hypothesis proposes that the high prevalence of type 2 diabetes (T2D) in Native Americans and admixed Latin Americans has a genetic basis and reflects an evolutionary adaptation to a past low-calorie/high-exercise lifestyle. However, identification of the gene variants underpinning this hypothesis remains elusive. Here we assessed the role of Native American ancestry, socioeconomic status (SES) and 21 candidate gene loci in susceptibility to T2D in a sample of 876 T2D cases and 399 controls from Antioquia (Colombia). Although mean Native American ancestry is significantly higher in T2D cases than in controls (32% vs. 29%), this difference is confounded by the correlation of ancestry with SES, which is a stronger predictor of disease status. Nominally significant association (P<0.05) was observed for markers in TCF7L2, RBMS1, CDKAL1, ZNF239, KCNQ1 and TCF1, and a significant bias (P<0.05) towards OR>1 was observed for markers selected from previous T2D genome-wide association studies, consistent with a role for Old World variants in susceptibility to T2D in Latin Americans. No association was found with the only known Native American-specific gene variant previously associated with T2D in a Mexican sample (rs9282541 in ABCA1). An admixture mapping scan with 1,536 ancestry informative markers (AIMs) did not identify genome regions with significant deviation of ancestry in Antioquia. Exclusion analysis indicates that this scan rules out ~95% of the genome as harboring loci with ancestry risk ratios >1.22 (at P < 0.05).
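A bias towards OR>1 across a set of markers can be checked with a one-sided sign test, sketched below. The abstract does not state which test was used, so the binomial sign test here is an assumption, and the counts (15 of 20 markers with OR>1) are hypothetical rather than taken from the study.

```python
from math import comb


def one_sided_sign_test(n_markers, n_risk_direction):
    """P(X >= n_risk_direction) for X ~ Binomial(n_markers, 0.5).

    Under the null hypothesis each marker is equally likely to show
    an odds ratio above or below 1.
    """
    tail = sum(comb(n_markers, i)
               for i in range(n_risk_direction, n_markers + 1))
    return tail / 2 ** n_markers


# Hypothetical counts: 15 of 20 markers with OR > 1.
p_value = one_sided_sign_test(20, 15)
print(round(p_value, 4))
```

With these illustrative counts the one-sided p-value is about 0.021, which would count as a significant excess of risk-direction effects at the 0.05 level. The test treats markers as independent, which markers in linkage disequilibrium would violate.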