ABSTRACT
Inflammatory bowel disease (IBD), including ulcerative colitis (UC) and Crohn disease (CD), has emerged as a global disease with an increasing incidence in developing and newly industrialized regions such as South America. This global rise offers the opportunity to explore differences and similarities in disease presentation and outcomes across genetic backgrounds and geographic locations. Our study included 265 IBD patients. We performed an exploratory analysis of databases of Chilean and North American IBD patients to compare clinical phenotypes between the cohorts. For each disease, we applied unsupervised machine learning methods, including principal component analysis and uniform manifold approximation and projection (UMAP), among others. Finally, we predicted the cohort (North American vs Chilean) using a random forest. Several unsupervised machine learning methods separated the 2 main groups, supporting the differences between North American and Chilean patients with each disease. The clinical variables with the largest loadings on the principal components were related to therapies and to disease extension/location at diagnosis. Random forest models trained to classify the cohort from clinical characteristics achieved high accuracy (UC, 0.86; CD, 0.79). In these models, variables related to therapy and disease extension/location had a high Gini index. Univariate analysis also showed a later age at CD diagnosis in Chilean IBD patients (37 vs 24 years; P = .005). Our study suggests a clinical difference between North American and Chilean IBD patients: a later age at CD diagnosis with a predominantly less aggressive phenotype (39% vs 54% B1) and more limited disease, despite fewer biological therapies being used in Chile for both diseases.
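The Gini index reported for the random forest ranks variables by how much splitting on them reduces class impurity. As a rough illustration only, the sketch below computes that impurity decrease for a single split in plain Python. The toy records and the feature names `on_biologics` and `smoker` are hypothetical and are not taken from the study data; a full random forest would average such decreases over many splits and trees.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(rows, feature, label="cohort"):
    """Impurity decrease obtained by splitting rows on a boolean feature."""
    parent = [r[label] for r in rows]
    left = [r[label] for r in rows if r[feature]]
    right = [r[label] for r in rows if not r[feature]]
    n = len(rows)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


# Hypothetical toy records (NOT study data), for illustration only.
toy = [
    {"cohort": "North America", "on_biologics": True, "smoker": True},
    {"cohort": "North America", "on_biologics": True, "smoker": False},
    {"cohort": "Chile", "on_biologics": False, "smoker": True},
    {"cohort": "Chile", "on_biologics": False, "smoker": False},
]

for name in ("on_biologics", "smoker"):
    print(name, gini_decrease(toy, name))
```

In this toy set, the feature that separates the two cohorts perfectly yields the largest possible decrease for a balanced two-class node, while the uninformative feature yields none.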
Subject(s)
Ulcerative Colitis, Crohn Disease, Inflammatory Bowel Diseases, Chile/epidemiology, Ulcerative Colitis/genetics, Ethnicity, Humans, Inflammatory Bowel Diseases/diagnosis, North America/epidemiology, Phenotype
ABSTRACT
The obesity epidemic is advancing worldwide, and frequent nationwide surveys to measure the percentage of the population that is obese are costly. In contrast, country-level food sales information can be obtained inexpensively and regularly from different suppliers. This study applies a methodology to predict obesity prevalence at the country level based on national sales of a small subset of food and beverage categories. Three machine learning algorithms for nonlinear regression were implemented using purchase and obesity prevalence data from 79 countries: support vector machines, random forests, and extreme gradient boosting. The proposed method was validated in terms of both the absolute prediction error and the proportion of countries for which obesity prevalence was predicted satisfactorily. We found that the most relevant food category for predicting obesity is baked goods and flours, followed by cheese and carbonated drinks.
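The two validation criteria named above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the prevalence values are hypothetical, and the 3-percentage-point `tolerance` used to call a prediction satisfactory is an assumed threshold, not one given in the abstract.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and predicted prevalence."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def share_within_tolerance(actual, predicted, tolerance):
    """Proportion of countries whose absolute error is at most `tolerance`."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance)
    return hits / len(actual)


# Hypothetical obesity prevalence values in percent (NOT study data).
actual = [20.0, 30.0, 10.0, 25.0]
predicted = [22.0, 29.0, 16.0, 25.0]

mae = mean_absolute_error(actual, predicted)
# tolerance=3.0 percentage points is an assumed threshold for illustration.
share = share_within_tolerance(actual, predicted, tolerance=3.0)

print(f"mean absolute error: {mae:.2f} percentage points")
print(f"share of countries within tolerance: {share:.2f}")
```

Reporting both numbers is useful because a low average error can hide a few countries with large misses, which the second criterion exposes.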