Results 1 - 20 of 145
1.
Comput Biol Chem ; 113: 108191, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39243549

ABSTRACT

Polycystic ovary syndrome (PCOS) is one of the most common anovulatory disorders observed in women presenting with infertility. Several high- and low-throughput studies on PCOS have led to the accumulation of a vast amount of information on PCOS. Despite the availability of several resources that index the advances in PCOS, information on its etiology remains inadequate. Analysis of the existing information using an integrated, evidence-based approach may aid the identification of novel potential candidate genes with a role in PCOS pathophysiology. This work focuses on integrating existing information on PCOS from literature and gene expression studies and evaluating the application of gene prioritization and network analysis to predict missing novel candidates. Further, it assesses the utility of evidence-based scoring to rank genes for their association with PCOS. The results of this study led to the identification of ∼2000 plausible candidate genes associated with PCOS. In silico validation of these identified candidates confirmed the role of 938 genes in PCOS. Further, experimental validation was carried out for four of the potential candidate genes involved in the complement and coagulation pathway, one high-scoring (PROS1), two mid-scoring (C1QA and KNG1), and one low-scoring (VTN), by comparing protein levels in follicular fluid in women with PCOS and healthy controls. While the expression of PROS1, C1QA, and KNG1 was found to be significantly downregulated in women with PCOS, the expression of VTN was found to be unchanged in PCOS. The findings of this study reiterate the utility of employing in silico approaches to identify and prioritize the most promising candidate genes in diseases with a complex pathophysiology like PCOS. Further, the study also helps in gaining clearer insights into the molecular mechanisms associated with the manifestation of the PCOS phenotype by contributing to the existing repertoire of genes associated with PCOS.

2.
Am J Hum Genet ; 2024 Sep 04.
Article in English | MEDLINE | ID: mdl-39255797

ABSTRACT

Phenotype-driven gene prioritization is fundamental to diagnosing rare genetic disorders. While traditional approaches rely on curated knowledge graphs with phenotype-gene relations, recent advancements in large language models (LLMs) promise a streamlined text-to-gene solution. In this study, we evaluated five LLMs, including two generative pre-trained transformers (GPT) series and three Llama2 series, assessing their performance across task completeness, gene prediction accuracy, and adherence to required output structures. We conducted experiments, exploring various combinations of models, prompts, phenotypic input types, and task difficulty levels. Our findings revealed that the best-performed LLM, GPT-4, achieved an average accuracy of 17.0% in identifying diagnosed genes within the top 50 predictions, which still falls behind traditional tools. However, accuracy increased with the model size. Consistent results were observed over time, as shown in the dataset curated after 2023. Advanced techniques such as retrieval-augmented generation (RAG) and few-shot learning did not improve the accuracy. Sophisticated prompts were more likely to enhance task completeness, especially in smaller models. Conversely, complicated prompts tended to decrease output structure compliance rate. LLMs also achieved better-than-random prediction accuracy with free-text input, though performance was slightly lower than with standardized concept input. Bias analysis showed that highly cited genes, such as BRCA1, TP53, and PTEN, are more likely to be predicted. Our study provides valuable insights into integrating LLMs with genomic analysis, contributing to the ongoing discussion on their utilization in clinical workflows.
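
The "accuracy within the top 50 predictions" metric quoted in this abstract is a top-k hit rate. The following is a minimal sketch of how such a rate can be computed, not the authors' evaluation code; the function name `top_k_accuracy`, the gene labels, and the toy cases are all invented for illustration.

```python
def top_k_accuracy(cases, k):
    """Fraction of cases whose true gene appears among the first k predictions."""
    if not cases:
        return 0.0
    hits = sum(1 for truth, ranked in cases if truth in ranked[:k])
    return hits / len(cases)


# Toy data (hypothetical): each case is (diagnosed gene, ranked predictions).
cases = [
    ("GENE_A", ["GENE_A", "GENE_B", "GENE_C"]),
    ("GENE_B", ["GENE_C", "GENE_A", "GENE_B"]),
    ("GENE_C", ["GENE_A", "GENE_B", "GENE_D"]),
    ("GENE_D", ["GENE_B", "GENE_D", "GENE_A"]),
]

acc_top1 = top_k_accuracy(cases, 1)
acc_top2 = top_k_accuracy(cases, 2)
acc_top3 = top_k_accuracy(cases, 3)
```

With k set to 50 and one entry per diagnosed patient, the same function would yield the kind of figure the abstract reports.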

3.
Brief Funct Genomics ; 2024 Sep 03.
Article in English | MEDLINE | ID: mdl-39228011

ABSTRACT

Rapidly identifying candidate genes underlying major QTLs is crucial for improving rice (Oryza sativa L.). In this study, we developed a workflow to rapidly prioritize candidate genes underpinning 99 major QTLs governing yield component traits. This workflow integrates multiomics databases, including sequence variation, gene expression, gene ontology, co-expression analysis, and protein-protein interaction. We predicted 206 candidate genes for 99 reported QTLs governing ten economically important yield-contributing traits using this approach. Among these, transcription factors belonging to families of MADS-box, WRKY, helix-loop-helix, TCP, MYB, GRAS, auxin response factor, and nuclear transcription factor Y subunit were promising. Validation of key prioritized candidate genes in contrasting rice genotypes for sequence variation and differential expression identified Leucine-Rich Repeat family protein (LOC_Os03g28270) and cytochrome P450 (LOC_Os02g57290) as candidate genes for the major QTLs GL1 and pl2.1, which govern grain length and panicle length, respectively. In conclusion, this study demonstrates that our workflow can significantly narrow down a large number of annotated genes in a QTL to a very small number of the most probable candidates, achieving approximately a 21-fold reduction. These candidate genes have potential implications for enhancing rice yield.

4.
Comput Struct Biotechnol J ; 23: 2277-2288, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38840833

ABSTRACT

The increasing availability of RNA sequencing data has opened up numerous opportunities to analyze various RNA interactions, including microRNA-target interactions (MTIs). In response to the necessity for a specialized tool to study MTIs in cancer and normal tissues, we developed AmiCa (https://amica.omics.si/), a web server designed for comprehensive analysis of mature microRNA (miRNA) and gene expression in 32 cancer types. Data from 9498 tumor samples and 626 normal samples from The Cancer Genome Atlas were obtained through the Genomic Data Commons and used to calculate differential expression and miRNA-target gene (MTI) correlations. AmiCa provides data on differential expression of miRNAs/genes for cancers for which normal tissue samples were available. In addition, the server calculates and presents correlations separately for tumor and normal samples for cancers for which normal samples are available. Furthermore, it enables the exploration of miRNA/gene expression in all cancer types with different miRNA/gene expression. In addition, AmiCa includes a ranking system for genes and miRNAs that can be used to identify those that are particularly highly expressed in certain cancers compared to other cancers, facilitating targeted and cancer-specific research. Finally, the functionality of AmiCa is illustrated by two case studies.

5.
Trends Genet ; 40(8): 642-667, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38734482

ABSTRACT

Genome-wide association studies (GWASs) have identified numerous genetic loci associated with human traits and diseases. However, pinpointing the causal genes remains a challenge, which impedes the translation of GWAS findings into biological insights and medical applications. In this review, we provide an in-depth overview of the methods and technologies used for prioritizing genes from GWAS loci, including gene-based association tests, integrative analysis of GWAS and molecular quantitative trait loci (xQTL) data, linking GWAS variants to target genes through enhancer-gene connection maps, and network-based prioritization. We also outline strategies for generating context-dependent xQTL data and their applications in gene prioritization. We further highlight the potential of gene prioritization in drug repurposing. Lastly, we discuss future challenges and opportunities in this field.


Subject(s)
Genome-Wide Association Study, Quantitative Trait Loci, Humans, Quantitative Trait Loci/genetics, Genome-Wide Association Study/methods, Genetic Predisposition to Disease, Single Nucleotide Polymorphism/genetics, Gene Regulatory Networks/genetics
6.
Am J Hum Genet ; 111(6): 1035-1046, 2024 06 06.
Article in English | MEDLINE | ID: mdl-38754426

ABSTRACT

Obesity is a major risk factor for a myriad of diseases, affecting >600 million people worldwide. Genome-wide association studies (GWASs) have identified hundreds of genetic variants that influence body mass index (BMI), a commonly used metric to assess obesity risk. Most variants are non-coding and likely act through regulating genes nearby. Here, we apply multiple computational methods to prioritize the likely causal gene(s) within each of the 536 previously reported GWAS-identified BMI-associated loci. We performed summary-data-based Mendelian randomization (SMR), FINEMAP, DEPICT, MAGMA, transcriptome-wide association studies (TWASs), mutation significance cutoff (MSC), polygenic priority score (PoPS), and the nearest gene strategy. Results of each method were weighted based on their success in identifying genes known to be implicated in obesity, ranking all prioritized genes according to a confidence score (minimum: 0; max: 28). We identified 292 high-scoring genes (≥11) in 264 loci, including genes known to play a role in body weight regulation (e.g., DGKI, ANKRD26, MC4R, LEPR, BDNF, GIPR, AKT3, KAT8, MTOR) and genes related to comorbidities (e.g., FGFR1, ISL1, TFAP2B, PARK2, TCF7L2, GSK3B). For most of the high-scoring genes, however, we found limited or no evidence for a role in obesity, including the top-scoring gene BPTF. Many of the top-scoring genes seem to act through a neuronal regulation of body weight, whereas others affect peripheral pathways, including circadian rhythm, insulin secretion, and glucose and carbohydrate homeostasis. The characterization of these likely causal genes can increase our understanding of the underlying biology and offer avenues to develop therapeutics for weight loss.
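
The weighted confidence score described in this abstract can be illustrated with a short sketch. It assumes, for illustration only, that a gene's score is the sum of the weights of the methods that nominated it; the abstract does not give the actual weights or aggregation rule, so `METHOD_WEIGHTS`, the threshold, and the gene labels below are hypothetical.

```python
from collections import defaultdict

# Hypothetical per-method weights; the abstract does not list the real values.
METHOD_WEIGHTS = {"SMR": 4, "FINEMAP": 3, "PoPS": 3, "MAGMA": 2, "nearest_gene": 1}


def confidence_scores(nominations, weights):
    """Sum the weight of every method that nominated each gene."""
    scores = defaultdict(int)
    for method, genes in nominations.items():
        for gene in set(genes):  # count each method at most once per gene
            scores[gene] += weights[method]
    return dict(scores)


def high_scoring(scores, threshold):
    """Genes at or above the threshold, highest score first, ties broken by name."""
    hits = [(gene, score) for gene, score in scores.items() if score >= threshold]
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


# Toy nominations, invented for illustration only.
nominations = {
    "SMR": ["GENE_A", "GENE_B"],
    "FINEMAP": ["GENE_A"],
    "PoPS": ["GENE_A", "GENE_C"],
    "MAGMA": ["GENE_B", "GENE_C"],
    "nearest_gene": ["GENE_C"],
}

scores = confidence_scores(nominations, METHOD_WEIGHTS)
top = high_scoring(scores, threshold=6)
```

Weighting methods by how well they recover known genes, as the abstract describes, means a nomination from a historically reliable method moves a gene further up the ranking than one from a weaker method.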


Subject(s)
Body Mass Index, Genome-Wide Association Study, Obesity, Humans, Obesity/genetics, Genetic Predisposition to Disease, Single Nucleotide Polymorphism, Multifactorial Inheritance/genetics, Genetic Loci, Mendelian Randomization Analysis
7.
medRxiv ; 2024 May 16.
Article in English | MEDLINE | ID: mdl-38798390

ABSTRACT

Background: Schizophrenia genome-wide association studies (GWASes) have identified >250 significant loci and prioritized >100 disease-related genes. However, gene prioritization efforts have mostly been restricted to locus-based methods that ignore information from the rest of the genome. Methods: To more accurately characterize genes involved in schizophrenia etiology, we applied a combination of highly-predictive tools to a published GWAS of 67,390 schizophrenia cases and 94,015 controls. We combined both locus-based methods (fine-mapped coding variants, distance to GWAS signals) and genome-wide methods (PoPS, MAGMA, ultra-rare coding variant burden tests). To validate our findings, we compared them with previous prioritization efforts, known neurodevelopmental genes, and results from the PsyOPS tool. Results: We prioritized 62 schizophrenia genes, 41 of which were also highlighted by our validation methods. In addition to DRD2, the principal target of antipsychotics, we prioritized 9 genes that are targeted by approved or investigational drugs. These included drugs targeting glutamatergic receptors (GRIN2A and GRM3), calcium channels (CACNA1C and CACNB2), and GABAB receptor (GABBR2). These also included genes in loci that are shared with an addiction GWAS (e.g. PDE4B and VRK2). Conclusions: We curated a high-quality list of 62 genes that likely play a role in the development of schizophrenia. Developing or repurposing drugs that target these genes may lead to a new generation of schizophrenia therapies. Rodent models of addiction more closely resemble the human disorder than rodent models of schizophrenia. As such, genes prioritized for both disorders could be explored in rodent addiction models, potentially facilitating drug development.

8.
Hum Genomics ; 18(1): 34, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38566255

ABSTRACT

BACKGROUND: Male-pattern baldness (MPB) is the most common cause of hair loss in men. It can be categorized into three types: type 2 (T2), type 3 (T3), and type 4 (T4), with type 1 (T1) being considered normal. Although various MPB-associated genetic variants have been suggested, a comprehensive study for linking these variants to gene expression regulation has not been performed to the best of our knowledge. RESULTS: In this study, we prioritized MPB-related tissue panels using tissue-specific enrichment analysis and utilized single-tissue panels from genotype-tissue expression version 8, as well as cross-tissue panels from context-specific genetics. Through a transcriptome-wide association study and colocalization analysis, we identified 52, 75, and 144 MPB associations for T2, T3, and T4, respectively. To assess the causality of MPB genes, we performed a conditional and joint analysis, which revealed 10, 11, and 54 putative causality genes for T2, T3, and T4, respectively. Finally, we conducted drug repositioning and identified potential drug candidates that are connected to MPB-associated genes. CONCLUSIONS: Overall, through an integrative analysis of gene expression and genotype data, we have identified robust MPB susceptibility genes that may help uncover the underlying molecular mechanisms and the novel drug candidates that may alleviate MPB.


Subject(s)
Alopecia, Transcriptome, Humans, Male, Transcriptome/genetics, Alopecia/genetics, Alopecia/metabolism, Genotype, Prognosis, Genome-Wide Association Study, Genetic Predisposition to Disease
9.
Comput Biol Chem ; 110: 108038, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38461796

ABSTRACT

The local disruptions caused by the genes of one disease can influence the pathways associated with the other diseases resulting in comorbidity. For gene therapies, it is necessary to prioritize the key genes that regulate common biological mechanisms to tackle the issues caused by overlapping diseases. This work proposes a clustering-based computational approach for prioritising the comorbid genes within the overlapping disease modules by analyzing Protein-Protein Interaction networks. For this, a sub-network with gene interactions of the disease pair was extracted from the interactome. The edge weights are assigned by combining the pairwise gene expression correlation and betweenness centrality scores. Further, a weighted graph clustering algorithm is applied and dominant nodes of high-density clusters are ranked based on clustering coefficients and neighborhood connectivity. Case studies based on neurodegenerative diseases such as Amyotrophic Lateral Sclerosis- Spinal Muscular Atrophy (ALS-SMA) pair and cancers such as Ovarian Carcinoma-Invasive Ductal Breast Carcinoma (OC-IDBC) pair were conducted to examine the efficacy of the proposed method. To identify the mechanistic role of top-ranked genes, we used Functional and Pathway enrichment analysis, connectivity analysis with leave-one-out (LOO) method, analysis of associated disease-related protein complexes, and prioritization tools such as TOPPGENE and Heml2.0. From pathway analysis, it was observed that the top 10 genes obtained using the proposed method were associated with 10 pathways in ALS-SMA comorbidity and 15 in the case of OC-IDBC, while that in similar methods like SAPDSB and S2B were 4, 6 respectively for ALS-SMA and 9, 10 respectively for OC-IDBC. In both case studies, 70 % of the disease-specific benchmark protein complexes were linked to top-ranked genes of the proposed method while that of SAPDSB and S2B were 55 % and 60 % respectively. 
Additionally, it was found that the removal of the top 10 genes disconnects the network into 14 distinct components in the case of ALS-SMA and 9 in the case of OC-IDBC. The experimental results show that the proposed method can be effectively used for identifying key genes in comorbidity and can offer insights into the intricate molecular relationship driving comorbid diseases.
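
The connectivity check reported in this abstract, counting how many components remain once top-ranked genes are removed, can be sketched as follows. This is an illustrative pure-Python version under the assumption of an undirected network stored as an adjacency dictionary; the function name `count_components` and the toy graph are invented and are not taken from the paper.

```python
def count_components(graph, removed=()):
    """Count connected components of an undirected graph after deleting nodes."""
    removed = set(removed)
    seen = set()
    components = 0
    for start in graph:
        if start in removed or start in seen:
            continue
        components += 1
        stack = [start]
        seen.add(start)
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            for neighbor in graph[node]:
                if neighbor not in removed and neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
    return components


# Toy network (hypothetical): hub H touches A, B, C; C also touches D.
graph = {
    "H": {"A", "B", "C"},
    "A": {"H"},
    "B": {"H"},
    "C": {"H", "D"},
    "D": {"C"},
}

before = count_components(graph)
after = count_components(graph, removed=["H"])
```

A larger jump in component count after removal suggests the removed genes were holding otherwise separate parts of the network together.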


Subject(s)
Amyotrophic Lateral Sclerosis, Humans, Amyotrophic Lateral Sclerosis/genetics, Protein Interaction Maps/genetics, Cluster Analysis, Transcriptome/genetics, Algorithms, Gene Regulatory Networks, Female, Computational Biology, Comorbidity, Spinal Muscular Atrophy/genetics, Ovarian Neoplasms/genetics
10.
Hum Genomics ; 18(1): 15, 2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38326862

ABSTRACT

BACKGROUND: It is valuable to analyze the genome-wide association studies (GWAS) data for a complex disease phenotype in the context of the protein-protein interaction (PPI) network, as the related pathophysiology results from the function of interacting polyprotein pathways. The analysis may include the design and curation of a phenotype-specific GWAS meta-database incorporating genotypic and eQTL data linking to PPI and other biological datasets, and the development of systematic workflows for PPI network-based data integration toward protein and pathway prioritization. Here, we pursued this analysis for blood pressure (BP) regulation. METHODS: The relational scheme of the implemented in Microsoft SQL Server BP-GWAS meta-database enabled the combined storage of: GWAS data and attributes mined from GWAS Catalog and the literature, Ensembl-defined SNP-transcript associations, and GTEx eQTL data. The BP-protein interactome was reconstructed from the PICKLE PPI meta-database, extending the GWAS-deduced network with the shortest paths connecting all GWAS-proteins into one component. The shortest-path intermediates were considered as BP-related. For protein prioritization, we combined a new integrated GWAS-based scoring scheme with two network-based criteria: one considering the protein role in the reconstructed by shortest-path (RbSP) interactome and one novel promoting the common neighbors of GWAS-prioritized proteins. Prioritized proteins were ranked by the number of satisfied criteria. RESULTS: The meta-database includes 6687 variants linked with 1167 BP-associated protein-coding genes. The GWAS-deduced PPI network includes 1065 proteins, with 672 forming a connected component. The RbSP interactome contains 1443 additional, network-deduced proteins and indicated that essentially all BP-GWAS proteins are at most second neighbors. 
The prioritized BP-protein set was derived from the union of the most BP-significant by any of the GWAS-based or the network-based criteria. It included 335 proteins, with ~ 2/3 deduced from the BP PPI network extension and 126 prioritized by at least two criteria. ESR1 was the only protein satisfying all three criteria, followed in the top-10 by INSR, PTN11, CDK6, CSK, NOS3, SH2B3, ATP2B1, FES and FINC, satisfying two. Pathway analysis of the RbSP interactome revealed numerous bioprocesses, which are indeed functionally supported as BP-associated, extending our understanding about BP regulation. CONCLUSIONS: The implemented workflow could be used for other multifactorial diseases.
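
The network-extension step in this abstract, adding the intermediates of shortest paths that connect GWAS proteins, can be sketched with a breadth-first search. This is a simplified illustration, not the published workflow: it keeps only one shortest path per protein pair (the paper may use all of them), and the node names and helper functions are hypothetical.

```python
from collections import deque
from itertools import combinations


def shortest_path(graph, source, target):
    """Return one shortest path from source to target via BFS, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk parent links back to the source
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def path_intermediates(graph, seeds):
    """Collect non-seed nodes lying on one shortest path between each seed pair."""
    seeds = set(seeds)
    found = set()
    for a, b in combinations(sorted(seeds), 2):
        path = shortest_path(graph, a, b)
        if path:
            found.update(n for n in path[1:-1] if n not in seeds)
    return found


# Toy undirected network with invented node names; S1..S3 play the GWAS proteins.
edges = [("S1", "X"), ("X", "S2"), ("S2", "Y"), ("Y", "Z"), ("Z", "S3"), ("S1", "W")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

intermediates = path_intermediates(graph, ["S1", "S2", "S3"])
```

In the toy network, `W` hangs off a seed but lies on no seed-to-seed path, so it is not added, which mirrors the idea that only connecting intermediates are treated as related to the phenotype.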


Subject(s)
Genome-Wide Association Study, Protein Interaction Maps, Humans, Protein Interaction Maps/genetics, Genome-Wide Association Study/methods, Blood Pressure/genetics, Genotype, Factual Databases, Plasma Membrane Calcium-Transporting ATPases
11.
Genome Biol ; 25(1): 1, 2024 01 02.
Article in English | MEDLINE | ID: mdl-38167462

ABSTRACT

BACKGROUND: The vast majority of findings from human genome-wide association studies (GWAS) map to non-coding sequences, complicating their mechanistic interpretations and clinical translations. Non-coding sequences that are evolutionarily conserved and biochemically active could offer clues to the mechanisms underpinning GWAS discoveries. However, genetic effects of such sequences have not been systematically examined across a wide range of human tissues and traits, hampering progress to fully understand regulatory causes of human complex traits. RESULTS: Here we develop a simple yet effective strategy to identify functional elements exhibiting high levels of human-mouse sequence conservation and enhancer-like biochemical activity, which scales well to 313 epigenomic datasets across 106 human tissues and cell types. Combined with 468 GWAS of European (EUR) and East Asian (EAS) ancestries, these elements show tissue-specific enrichments of heritability and causal variants for many traits, which are significantly stronger than enrichments based on enhancers without sequence conservation. These elements also help prioritize candidate genes that are functionally relevant to body mass index (BMI) and schizophrenia but were not reported in previous GWAS with large sample sizes. CONCLUSIONS: Our findings provide a comprehensive assessment of how sequence-conserved enhancer-like elements affect complex traits in diverse tissues and demonstrate a generalizable strategy of integrating evolutionary and biochemical data to elucidate human disease genetics.


Subject(s)
Genome-Wide Association Study, Multifactorial Inheritance, Humans, Mice, Animals, Epigenomics, Phenotype, Genetic Enhancer Elements, Single Nucleotide Polymorphism
12.
HGG Adv ; 5(1): 100252, 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-37859345

ABSTRACT

Previous genome-wide association studies (GWASs) for adiponectin, a complex trait linked to type 2 diabetes and obesity, identified >20 associated loci. However, most loci were identified in populations of European ancestry, and many of the target genes underlying the associations remain unknown. We conducted a cross-ancestry adiponectin GWAS meta-analysis in ≤46,434 individuals from the Metabolic Syndrome in Men (METSIM) cohort and the ADIPOGen and AGEN consortiums. We combined study-specific association summary statistics using a fixed-effects, inverse variance-weighted approach. We identified 22 loci associated with adiponectin (p < 5×10⁻⁸), including 15 known and seven previously unreported loci. Among individuals of European ancestry, Genome-wide Complex Traits Analysis joint conditional analysis (GCTA-COJO) identified 14 additional distinct signals at the ADIPOQ, CDH13, HCAR1, and ZNF664 loci. Leveraging the cross-ancestry data, FINEMAP + SuSiE identified 45 causal variants (PP > 0.9), which also exhibited potential pleiotropy for cardiometabolic traits. To prioritize target genes at associated loci, we propose a combinatorial likelihood scoring formalism (Gene Priority Score [GPScore]) based on measures derived from 11 gene prioritization strategies and the physical distance to the transcription start site. With GPScore, we prioritize the 30 most probable target genes underlying the adiponectin-associated variants in the cross-ancestry analysis, including well-known causal genes (e.g., ADIPOQ, CDH13) and additional genes (e.g., CSF1, RGS17). Functional association networks revealed complex interactions of prioritized genes, their functionally connected genes, and their underlying pathways centered around insulin and adiponectin signaling, indicating an essential role in regulating energy balance in the body, inflammation, coagulation, fibrinolysis, insulin resistance, and diabetes. 
Overall, our analyses identify and characterize adiponectin association signals and inform experimental interrogation of target genes for adiponectin.
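
The fixed-effects, inverse variance-weighted combination mentioned in this abstract follows a standard formula: each study's effect estimate is weighted by the reciprocal of its squared standard error. The sketch below shows that textbook calculation for a single variant; it is not the consortium's pipeline, and the function name and the numeric inputs are invented for illustration.

```python
import math


def fixed_effects_meta(estimates):
    """Combine (beta, standard_error) pairs with inverse-variance weights.

    Returns the pooled beta, its standard error, and the z-score.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se


# Hypothetical per-study effect sizes and standard errors for one variant.
studies = [(0.10, 0.05), (0.20, 0.10)]
beta, se, z = fixed_effects_meta(studies)
```

Because the weights are inverse variances, the more precise study (smaller standard error) pulls the pooled estimate toward its own value, and the pooled standard error is smaller than that of either study alone.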


Subject(s)
Type 2 Diabetes Mellitus, Metabolic Syndrome, Male, Humans, Adiponectin/genetics, Type 2 Diabetes Mellitus/genetics, Genome-Wide Association Study, Genetic Predisposition to Disease/genetics, Metabolic Syndrome/genetics
13.
Brain ; 147(3): 887-899, 2024 03 01.
Article in English | MEDLINE | ID: mdl-37804111

ABSTRACT

There are 78 loci associated with Parkinson's disease in the most recent genome-wide association study (GWAS), yet the specific genes driving these associations are mostly unknown. Herein, we aimed to nominate the top candidate gene from each Parkinson's disease locus and identify variants and pathways potentially involved in Parkinson's disease. We trained a machine learning model to predict Parkinson's disease-associated genes from GWAS loci using genomic, transcriptomic and epigenomic data from brain tissues and dopaminergic neurons. We nominated candidate genes in each locus and identified novel pathways potentially involved in Parkinson's disease, such as the inositol phosphate biosynthetic pathway (INPP5F, IP6K2, ITPKB and PPIP5K2). Specific common coding variants in SPNS1 and MLX may be involved in Parkinson's disease, and burden tests of rare variants further support that CNIP3, LSM7, NUCKS1 and the polyol/inositol phosphate biosynthetic pathway are associated with the disease. Functional studies are needed to further analyse the involvements of these genes and pathways in Parkinson's disease.


Subject(s)
Genome-Wide Association Study, Parkinson Disease, Humans, Parkinson Disease/genetics, Inositol Phosphates, Dopaminergic Neurons, Machine Learning, Phosphotransferases (Phosphate Group Acceptor)
14.
Res Sq ; 2023 Oct 19.
Article in English | MEDLINE | ID: mdl-37886583

ABSTRACT

We developed a computational framework that integrates Genome-Wide Association Studies (GWAS) and post-GWAS analyses, designed to facilitate drug repurposing for COVID-19 treatment. The comprehensive approach combines transcriptomic-wide associations, polygenic priority scoring, 3D genomics, viral-host protein-protein interactions, and small-molecule docking. Through GWAS, we identified nine druggable host genes associated with COVID-19 severity and SARS-CoV-2 infection, all of which show differential expression in COVID-19 patients. These genes include IFNAR1, IFNAR2, TYK2, IL10RB, CXCR6, CCR9, and OAS1. We performed an extensive molecular docking analysis of these targets using 553 small molecules derived from five therapeutically enriched categories, namely antibacterials, antivirals, antineoplastics, immunosuppressants, and anti-inflammatories. This analysis, which comprised over 20,000 individual docking analyses, enabled the identification of several promising drug candidates. All results are available via the DockCoV2 database (https://dockcov2.org/drugs/). The computational framework ultimately identified nine potential drug candidates: Peginterferon alfa-2b, Interferon alfa-2b, Interferon beta-1b, Ruxolitinib, Dactinomycin, Rolitetracycline, Irinotecan, Vinblastine, and Oritavancin. While its current focus is on COVID-19, our proposed computational framework can be applied more broadly to assist in drug repurposing efforts for a variety of diseases. Overall, this study underscores the potential of human genetic studies and the utility of a computational framework for drug repurposing in the context of COVID-19 treatment, providing a valuable resource for researchers in this field.

15.
Front Genet ; 14: 1190863, 2023.
Article in English | MEDLINE | ID: mdl-37867597

ABSTRACT

Background: Alzheimer's disease (AD) is a complex disorder, and its risk is influenced by multiple genetic and environmental factors. In this study, an AD risk gene prediction framework based on spatial and temporal features of gene expression data (STGE) was proposed. Methods: We proposed an AD risk gene prediction framework based on spatial and temporal features of gene expression data. The gene expression data of providers of different tissues and ages were used as model features. Human genes were classified as AD risk or non-risk sets based on information extracted from relevant databases. Support vector machine (SVM) models were constructed to capture the expression patterns of genes believed to contribute to the risk of AD. Results: The recursive feature elimination (RFE) method was utilized for feature selection. Data for 64 tissue-age features were obtained before feature selection, and this number was reduced to 19 after RFE was performed. The SVM models were built and evaluated using 19 selected and full features. The area under curve (AUC) values for the SVM model based on 19 selected features (0.740 [0.690-0.790]) and full feature sets (0.730 [0.678-0.769]) were very similar. Fifteen genes predicted to be risk genes for AD with a probability greater than 90% were obtained. Conclusion: The newly proposed framework performed comparably to previous prediction methods based on protein-protein interaction (PPI) network properties. A list of 15 candidate genes for AD risk was also generated to provide data support for further studies on the genetic etiology of AD.

16.
BMC Med Genomics ; 16(1): 208, 2023 09 04.
Article in English | MEDLINE | ID: mdl-37667328

ABSTRACT

BACKGROUND: Attention deficit hyperactivity disorder (ADHD) is commonly associated with developmental dyslexia (DD), which are both prevalent and complicated pediatric neurodevelopmental disorders that have a significant influence on children's learning and development. Clinically, the comorbidity incidence of DD and ADHD is between 25 and 48%. Children with DD and ADHD may have more severe cognitive deficiencies, a poorer level of schooling, and a higher risk of social and emotional management disorders. Furthermore, patients with this comorbidity are frequently treated for a single condition in clinical settings, and the therapeutic outcome is poor. The development of effective treatment approaches against these diseases is complicated by their comorbidity features. This is often a major problem in diagnosis and treatment. In this study, we developed bioinformatical methodology for the analysis of the comorbidity of these two diseases. As such, the search for candidate genes related to the comorbid conditions of ADHD and DD can help in elucidating the molecular mechanisms underlying the comorbid condition, and can also be useful for genotyping and identifying new drug targets. RESULTS: Using the ANDSystem tool, the reconstruction and analysis of gene networks associated with ADHD and dyslexia was carried out. The gene network of ADHD included 599 genes/proteins and 148,978 interactions, while that of dyslexia included 167 genes/proteins and 27,083 interactions. When the ANDSystem and GeneCards data were combined, a total of 213 genes/proteins for ADHD and dyslexia were found. An approach for ranking genes implicated in the comorbid condition of the two diseases was proposed. 
The approach is based on ten criteria for ranking genes by their importance, including relevance scores of association between disease and genes, standard methods of gene prioritization, as well as original criteria that take into account the characteristics of an associative gene network and the presence of known polymorphisms in the analyzed genes. Among the top 20 genes with the highest priority DRD2, DRD4, CNTNAP2 and GRIN2B are mentioned in the literature as directly linked with the comorbidity of ADHD and dyslexia. According to the proposed approach, the genes OPRM1, CHRNA4 and SNCA had the highest priority in the development of comorbidity of these two diseases. Additionally, it was revealed that the most relevant genes are involved in biological processes related to signal transduction, positive regulation of transcription from RNA polymerase II promoters, chemical synaptic transmission, response to drugs, ion transmembrane transport, nervous system development, cell adhesion, and neuron migration. CONCLUSIONS: The application of methods of reconstruction and analysis of gene networks is a powerful tool for studying the molecular mechanisms of comorbid conditions. The method put forth to rank genes by their importance for the comorbid condition of ADHD and dyslexia was employed to predict genes that play key roles in the development of the comorbid condition. The results can be utilized to plan experiments for the identification of novel candidate genes and search for novel pharmacological targets.


Subject(s)
Attention Deficit Disorder with Hyperactivity , Dyslexia , Humans , Child , Attention Deficit Disorder with Hyperactivity/complications , Attention Deficit Disorder with Hyperactivity/epidemiology , Attention Deficit Disorder with Hyperactivity/genetics , Gene Regulatory Networks , Dyslexia/complications , Dyslexia/epidemiology , Dyslexia/genetics , Comorbidity , Cell Movement
17.
BMC Bioinformatics ; 24(1): 355, 2023 Sep 21.
Article in English | MEDLINE | ID: mdl-37735349

ABSTRACT

BACKGROUND: Genome-wide association studies (GWAS) have identified hundreds of genetic loci associated with kidney function. By combining these findings with post-GWAS information (e.g., statistical fine-mapping to identify independent association signals and to narrow down signals to causal variants; or different sources of annotation data), new hypotheses regarding physiology and disease aetiology can be obtained. These hypotheses need to be tested in laboratory experiments, for example, to identify new therapeutic targets. For this purpose, the evidence obtained from GWAS and post-GWAS analyses must be processed and presented in a way that is easily accessible to kidney researchers without specific GWAS expertise. MAIN: Here we present KidneyGPS, a user-friendly web application that combines genetic variant associations with estimated glomerular filtration rate (eGFR) from the Chronic Kidney Disease Genetics consortium with annotation of (i) genetic variants with functional or regulatory effects ("SNP-to-gene" mapping), (ii) genes with kidney phenotypes in mice or humans ("gene-to-phenotype"), and (iii) druggability of genes (to support re-purposing). KidneyGPS adopts a comprehensive approach, summarizing evidence for all 5906 genes in the 424 GWAS loci for eGFR identified previously and the 35,885 variants in the 99% credible sets of 594 independent signals. KidneyGPS enables user-friendly access to this abundance of information through search functions for genes, variants, and regions. KidneyGPS also provides a function ("GPS tab") to generate lists of genes with specific characteristics, thus enabling customizable Gene Prioritisation (GPS). These specific characteristics can be as broad as any gene in the 424 loci with a known kidney phenotype in mice or humans, or they can be highly focussed on genes mapping to genetic variants or signals with particularly high statistical support. KidneyGPS is implemented with RShiny in a modularized fashion to facilitate updates of input data (https://kidneygps.ur.de/gps/). CONCLUSION: With its focus on kidney function related evidence, KidneyGPS fills a gap between large general platforms for accessing GWAS and post-GWAS results and the specific needs of the kidney research community. This makes KidneyGPS an important platform for kidney researchers to help translate in silico research results into in vitro or in vivo research.


Subject(s)
Genome-Wide Association Study , Renal Insufficiency, Chronic , Humans , Animals , Mice , Phenotype , Kidney , Chromosome Mapping
18.
Cell Genom ; 3(7): 100341, 2023 Jul 12.
Article in English | MEDLINE | ID: mdl-37492104

ABSTRACT

Drugs targeting genes linked to disease via evidence from human genetics have increased odds of approval. Approaches to prioritize such genes include genome-wide association studies (GWASs), rare variant burden tests in exome sequencing studies (Exome), or integration of a GWAS with expression/protein quantitative trait loci (eQTL/pQTL-GWAS). Here, we compare gene-prioritization approaches on 30 clinically relevant traits and benchmark their ability to recover drug targets. Across traits, prioritized genes were enriched for drug targets with odds ratios (ORs) of 2.17, 2.04, 1.81, and 1.31 for the GWAS, eQTL-GWAS, Exome, and pQTL-GWAS methods, respectively. Adjusting for differences in testable genes and sample sizes, GWAS outperforms e/pQTL-GWAS, but not the Exome approach. Furthermore, performance increased through gene network diffusion, although the node degree, being the best predictor (OR = 8.7), revealed strong bias in literature-curated networks. In conclusion, we systematically assessed strategies to prioritize drug target genes, highlighting the promises and pitfalls of current approaches.
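
The odds ratios quoted above measure how strongly prioritized genes are enriched for drug targets. The sketch below is a minimal illustration of that calculation from a 2x2 contingency table over a set of testable genes. It is not the authors' pipeline, which also adjusts for differences in testable genes and sample sizes. The function name `enrichment_odds_ratio` and the placeholder gene labels are hypothetical.

```python
def enrichment_odds_ratio(universe, prioritized, targets):
    """Odds ratio for drug-target enrichment among prioritized genes.

    Builds the 2x2 table over `universe` (all testable genes):
        a = prioritized and target       b = prioritized, not target
        c = not prioritized, target      d = neither
    and returns (a * d) / (b * c).
    """
    universe = set(universe)
    prioritized = set(prioritized) & universe
    targets = set(targets) & universe

    a = len(prioritized & targets)
    b = len(prioritized - targets)
    c = len(targets - prioritized)
    d = len(universe - prioritized - targets)

    if b == 0 or c == 0:
        # The plain odds ratio is undefined here; a continuity correction
        # would be one option, but it is left out of this sketch.
        raise ValueError("odds ratio undefined: an off-diagonal cell is zero")
    return (a * d) / (b * c)


# Toy example with placeholder gene labels, for illustration only.
toy_universe = [f"g{i}" for i in range(1, 11)]   # 10 testable genes
toy_prioritized = {"g1", "g2", "g3", "g4"}
toy_targets = {"g1", "g2", "g5"}
toy_or = enrichment_odds_ratio(toy_universe, toy_prioritized, toy_targets)
```

In the toy example the table is a = 2, b = 2, c = 1, d = 5, so the odds ratio is (2 × 5) / (2 × 1) = 5.0. Restricting every set to the same universe matters because methods with different testable gene sets are otherwise not comparable.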

19.
Front Genet ; 14: 1195213, 2023.
Article in English | MEDLINE | ID: mdl-37424726

ABSTRACT

Background: Nasal polyps (NP) are benign inflammatory growths of the nasal and paranasal sinus mucosa that can substantially impair patients' quality of life through symptoms such as nasal obstruction, insomnia, and anosmia. NP often relapse even after surgical treatment, and curative therapy will remain challenging without an understanding of the underlying mechanisms. Genome-wide association studies (GWASs) on NP have been conducted; however, few genes that are causally associated with NP have been identified. Methods: We aimed to prioritize NP-associated genes for functional follow-up studies using the summary data-based Mendelian Randomization (SMR) and Bayesian colocalization (COLOC) methods to integrate summary-level data from a GWAS on NP and an expression quantitative trait locus (eQTL) study in blood. We used GWAS data including 5,554 NP cases and 258,553 controls with 34 genome-wide significant loci from the FinnGen consortium (data freeze 8) and eQTL data from 31,684 participants of predominantly European ancestry from the eQTLGen consortium. Results: The SMR analysis identified several genes, including TNFRSF18, CTSK, and IRF1, that were associated with NP through pleiotropy or causality rather than linkage. The COLOC analysis strongly suggested that these genes and the NP trait were affected by shared causal variants, and thus were colocalized. An enrichment analysis with Metascape suggested that these genes might be involved in the biological process of cellular response to cytokine stimulus. Conclusion: We prioritized several NP-associated genes, including TNFRSF18, CTSK, and IRF1, for future functional follow-up studies to elucidate the underlying disease mechanisms.
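
The abstract does not give the statistics behind the SMR analysis. As background, the sketch below computes the SMR effect estimate and approximate test statistic as commonly described for the method, using the top cis-eQTL variant's effect on expression and on the trait. It is a simplified illustration, not the study's analysis. The function name `smr_test` and the input numbers are hypothetical. It also omits the HEIDI heterogeneity test, which is the step used to separate linkage from pleiotropy or causality.

```python
import math


def smr_test(b_zx, se_zx, b_zy, se_zy):
    """Approximate SMR test for one gene from summary statistics.

    b_zx, se_zx: effect and standard error of the top cis-eQTL on expression.
    b_zy, se_zy: effect and standard error of the same variant on the trait.
    Returns (b_xy, t_smr, p_value).
    """
    z_zx = b_zx / se_zx
    z_zy = b_zy / se_zy

    # Estimated effect of expression on the trait (ratio estimate).
    b_xy = b_zy / b_zx

    # Approximate chi-square statistic with 1 degree of freedom.
    t_smr = (z_zx ** 2 * z_zy ** 2) / (z_zx ** 2 + z_zy ** 2)

    # Upper tail of a chi-square(1) distribution: P(Z^2 > t) = erfc(sqrt(t / 2)).
    p_value = math.erfc(math.sqrt(t_smr / 2.0))
    return b_xy, t_smr, p_value


# Made-up summary statistics, for illustration only.
b_xy, t_smr, p_value = smr_test(b_zx=0.4, se_zx=0.1, b_zy=0.15, se_zy=0.05)
```

With these made-up inputs the z-scores are 4 and 3, so the statistic is 144 / 25 = 5.76. The statistic can never exceed the smaller of the two squared z-scores, which is why SMR needs a strong eQTL instrument.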

20.
HGG Adv ; 4(3): 100203, 2023 07 13.
Article in English | MEDLINE | ID: mdl-37250495

ABSTRACT

We introduce GCDPipe, a user-friendly tool for risk gene, cell type, and drug prioritization for complex traits. It uses gene-level GWAS-derived data and gene expression data to train a model for the identification of disease risk genes and relevant cell types. Gene prioritization information is then coupled with known drug target data to search for applicable drug agents based on their estimated functional effects on the identified risk genes. We illustrate the utility of our approach in different settings: identification of the cell types implicated in disease pathogenesis was tested in inflammatory bowel disease (IBD) and Alzheimer disease (AD); gene target and drug prioritization was tested in IBD and schizophrenia. The analysis of phenotypes with known disease-affected cell types and/or existing drug candidates shows that GCDPipe is an effective tool to unify genetic risk factors with cellular context and known drug targets. Analysis of the AD data with GCDPipe further suggested that gene targets of diuretics, as an Anatomical Therapeutic Chemical drug subgroup, are significantly enriched among the genes prioritized by GCDPipe, indicating their possible effect on the course of the disease.


Subject(s)
Alzheimer Disease , Inflammatory Bowel Diseases , Schizophrenia , Humans , Alzheimer Disease/drug therapy , Diuretics/pharmacology , Schizophrenia/genetics , Inflammatory Bowel Diseases/genetics