ABSTRACT
BACKGROUND: Atrial fibrillation (AF) is associated with substantial morbidity, especially when it goes undetected. If new-onset AF could be predicted, targeted screening could be used to find it early. We hypothesized that a deep neural network could predict new-onset AF from the resting 12-lead ECG and that this prediction may help identify those at risk of AF-related stroke. METHODS: We used 1.6 M resting 12-lead digital ECG traces from 430 000 patients collected from 1984 to 2019. Deep neural networks were trained to predict new-onset AF (within 1 year) in patients without a history of AF. Performance was evaluated using areas under the receiver operating characteristic curve and precision-recall curve. We performed an incidence-free survival analysis for a period of 30 years following the ECG stratified by model predictions. To simulate real-world deployment, we trained a separate model using all ECGs before 2010 and evaluated model performance on a test set of ECGs from 2010 through 2014 that were linked to our stroke registry. We identified the patients at risk for AF-related stroke among those predicted to be high risk for AF by the model at different prediction thresholds. RESULTS: The area under the receiver operating characteristic curve and area under the precision-recall curve were 0.85 and 0.22, respectively, for predicting new-onset AF within 1 year of an ECG. The hazard ratio for the predicted high- versus low-risk groups over a 30-year span was 7.2 (95% CI, 6.9-7.6). In a simulated deployment scenario, the model predicted new-onset AF at 1 year with a sensitivity of 69% and specificity of 81%. The number needed to screen to find 1 new case of AF was 9. This model predicted patients at high risk for new-onset AF in 62% of all patients who experienced an AF-related stroke within 3 years of the index ECG. CONCLUSIONS: Deep learning can predict new-onset AF from the 12-lead ECG in patients with no previous history of AF. 
This prediction may help identify patients at risk for AF-related strokes.
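The reported number needed to screen can be related to sensitivity, specificity, and the prevalence of new-onset AF in the screened population. The sketch below assumes that the number needed to screen is the reciprocal of the positive predictive value among patients the model flags as high risk. That definition, the function names, and the example prevalence are illustrative assumptions and are not taken from the study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of model-flagged patients who truly develop the condition."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def number_needed_to_screen(sensitivity, specificity, prevalence):
    """Flagged patients screened per true case found (assumed to be 1 / PPV)."""
    return 1.0 / positive_predictive_value(sensitivity, specificity, prevalence)


if __name__ == "__main__":
    # Sensitivity and specificity are the simulated-deployment values from the
    # abstract; the 1-year prevalence below is a hypothetical input.
    hypothetical_prevalence = 0.0333
    nns = number_needed_to_screen(0.69, 0.81, hypothetical_prevalence)
    print(f"Number needed to screen: {nns:.1f}")
```

Under this definition, the result is sensitive to prevalence: at fixed sensitivity and specificity, a rarer outcome raises the number needed to screen.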
Subject(s)
Atrial Fibrillation/diagnosis; Deep Learning/standards; Stroke/etiology; Atrial Fibrillation/complications; Electrocardiography; Female; Humans; Male; Neural Networks, Computer; Stroke/mortality; Survival Analysis
ABSTRACT
BACKGROUND: Polycystic ovary syndrome is the most common endocrine disorder affecting women of reproductive age. A number of criteria have been developed for clinical diagnosis of polycystic ovary syndrome, with the Rotterdam criteria being the most inclusive. Evidence suggests that polycystic ovary syndrome is significantly heritable, and previous studies have identified genetic variants associated with polycystic ovary syndrome diagnosed using different criteria. The widely adopted electronic health record system provides an opportunity to identify patients with polycystic ovary syndrome using the Rotterdam criteria for genetic studies. OBJECTIVE: To identify novel associated genetic variants under the same phenotype definition, we extracted polycystic ovary syndrome cases and unaffected controls based on the Rotterdam criteria from the electronic health records and performed a discovery-validation genome-wide association study. STUDY DESIGN: We developed a polycystic ovary syndrome phenotyping algorithm on the basis of the Rotterdam criteria and applied it to 3 electronic health record-linked biobanks to identify cases and controls for genetic study. In the discovery phase, we performed separate genome-wide association studies in the Geisinger MyCode and the Electronic Medical Records and Genomics cohorts, which were then meta-analyzed. We attempted validation of the significant association loci (P<1×10⁻⁶) in the BioVU cohort. All association analyses used logistic regression, assuming an additive genetic model, and adjusted for principal components to control for population stratification. An inverse-variance fixed-effect model was adopted for meta-analysis. In addition, we examined the top variants to evaluate their associations with each criterion in the phenotyping algorithm. We used the STRING database to characterize the protein-protein interaction network.
RESULTS: Using the same algorithm based on the Rotterdam criteria, we identified 2995 patients with polycystic ovary syndrome and 53,599 population controls in total (2742 cases and 51,438 controls in the discovery phase; 253 cases and 2161 controls in the validation phase). We identified 1 novel genome-wide significant variant, rs17186366 (odds ratio [OR]=1.37 [1.23, 1.54], P=2.8×10⁻⁸), located near SOD2. In addition, 2 loci with suggestive association were identified: rs113168128 (OR=1.72 [1.42, 2.10], P=5.2×10⁻⁸), an intronic variant of ERBB4 that is independent of the previously published variants, and rs144248326 (OR=2.13 [1.52, 2.86], P=8.45×10⁻⁷), a novel intronic variant in WWTR1. In further association tests of the top 3 single-nucleotide polymorphisms with each criterion in the polycystic ovary syndrome algorithm, we found that rs17186366 (SOD2) was associated with polycystic ovaries and hyperandrogenism, whereas rs113168128 (ERBB4) and rs144248326 (WWTR1) were mainly associated with oligomenorrhea or infertility. We also validated the previously reported association with DENND1A. Using the STRING database to characterize protein-protein interactions, we found that both ERBB4 and WWTR1 can interact with YAP1, which has been previously associated with polycystic ovary syndrome. CONCLUSION: Through a discovery-validation genome-wide association study on polycystic ovary syndrome identified from electronic health records using an algorithm based on the Rotterdam criteria, we identified and validated a novel genome-wide significant association with a variant near SOD2. We also identified a novel independent variant within ERBB4 and a suggestive association with WWTR1. Together with the previously identified polycystic ovary syndrome gene YAP1, the ERBB4-YAP1-WWTR1 network suggests involvement of the epidermal growth factor receptor and the Hippo pathway in the multifactorial etiology of polycystic ovary syndrome.
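The inverse-variance fixed-effect model named in the study design pools per-cohort effect estimates by weighting each one by the reciprocal of its squared standard error. The following is a minimal sketch of that standard formula, operating on log odds ratios. The function name and the example inputs are hypothetical and do not reproduce the study's cohort-level data or pipeline.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Pool effect estimates (e.g. log odds ratios) with inverse-variance weights.

    Returns the pooled estimate and its standard error.
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / (se ** 2) for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


if __name__ == "__main__":
    # Hypothetical per-cohort log odds ratios and standard errors.
    beta, se = fixed_effect_meta([0.30, 0.35], [0.08, 0.12])
    low, high = beta - 1.96 * se, beta + 1.96 * se
    print(f"Pooled OR = {math.exp(beta):.2f} "
          f"[{math.exp(low):.2f}, {math.exp(high):.2f}]")
```

More precise cohorts (smaller standard errors) dominate the pooled estimate, which is why a fixed-effect model is usually paired with a check for between-cohort heterogeneity.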
Subject(s)
Polycystic Ovary Syndrome/genetics; Receptor, ErbB-4/genetics; Trans-Activators/genetics; Adaptor Proteins, Signal Transducing/metabolism; Adult; Case-Control Studies; Electronic Health Records; Female; Genome-Wide Association Study; Humans; Hyperandrogenism/genetics; Infertility, Female/genetics; Middle Aged; Oligomenorrhea/genetics; Ovarian Cysts/genetics; Polycystic Ovary Syndrome/diagnosis; Polycystic Ovary Syndrome/physiopathology; Polymorphism, Single Nucleotide; Superoxide Dismutase/genetics; Transcription Factors/metabolism; Transcriptional Coactivator with PDZ-Binding Motif Proteins; YAP-Signaling Proteins
ABSTRACT
BACKGROUND: Arrhythmogenic right ventricular cardiomyopathy (ARVC) is associated with variants in desmosome genes. Secondary findings of pathogenic/likely pathogenic variants, primarily loss-of-function (LOF) variants, are recommended for clinical reporting; however, their prevalence and associated phenotype in a general clinical population are not fully characterized. METHODS: From whole-exome sequencing of 61 019 individuals in the DiscovEHR cohort, we screened for putative loss-of-function variants in PKP2, DSC2, DSG2, and DSP. We evaluated measures from prior clinical ECGs and echocardiograms, manually over-read to evaluate ARVC diagnostic criteria, and performed a PheWAS (phenome-wide association study). Finally, we estimated expected penetrance using Bayesian inference. RESULTS: One hundred forty individuals (0.23%; 59±18 years old at last encounter; 33% male) had an ARVC variant (G+). None had an existing diagnosis of ARVC in the electronic health record, and there were no significant differences in prior ECG or echocardiogram findings compared with matched controls without variants. Several G+ individuals satisfied major repolarization (n=4) and ventricular function (n=5) criteria, but this prevalence matched that of controls. PheWAS showed no significant associations with other heart disease diagnoses. Combining our best genetic and disease prevalence estimates yields an estimated penetrance of 6.0%. CONCLUSIONS: The prevalence of ARVC loss-of-function variants is ≈1:435 in a general clinical population of predominantly European descent, but with limited electronic health record-based evidence of phenotypic association in our population, consistent with a low penetrance estimate. Prospective deep phenotyping and longitudinal follow-up of a large sequenced cohort are needed to determine the true clinical relevance of an incidentally identified ARVC loss-of-function variant.
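A penetrance estimate of this kind can be illustrated with Bayes' rule: the probability of disease given a variant equals the disease prevalence times the fraction of cases carrying such a variant, divided by the variant prevalence. The sketch below is one simple reading of "Bayesian inference" and may differ from the authors' actual method. The variant prevalence uses the counts in the abstract (140 of 61 019), whereas the disease prevalence and the carrier fraction among cases are hypothetical placeholders, so the printed value is not expected to match the reported 6.0%.

```python
def penetrance(disease_prevalence, carrier_fraction_in_cases, variant_prevalence):
    """P(disease | variant) via Bayes' rule.

    P(D | G) = P(G | D) * P(D) / P(G)
    """
    if not 0.0 < variant_prevalence <= 1.0:
        raise ValueError("variant_prevalence must be in (0, 1]")
    return carrier_fraction_in_cases * disease_prevalence / variant_prevalence


if __name__ == "__main__":
    variant_prevalence = 140 / 61019      # counts from the abstract
    disease_prevalence = 1 / 5000         # hypothetical ARVC prevalence
    carrier_fraction_in_cases = 0.5       # hypothetical P(variant | ARVC)
    p = penetrance(disease_prevalence, carrier_fraction_in_cases,
                   variant_prevalence)
    print(f"Estimated penetrance: {p:.1%}")
```

Because the variant prevalence sits in the denominator, a variant class that is common relative to the disease necessarily implies low penetrance, which is the qualitative point the abstract makes.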