Results 1 - 20 of 158
1.
Cell ; 2024 Sep 28.
Article in English | MEDLINE | ID: mdl-39353437

ABSTRACT

Complex structural variations (cxSVs) are often overlooked in genome analyses due to detection challenges. We developed ARC-SV, a probabilistic and machine-learning-based method that enables accurate detection and reconstruction of cxSVs from standard datasets. By applying ARC-SV across 4,262 genomes representing all continental populations, we identified cxSVs as a significant source of natural human genetic variation. Rare cxSVs have a propensity to occur in neural genes and loci that underwent rapid human-specific evolution, including those regulating corticogenesis. By performing single-nucleus multiomics in postmortem brains, we discovered cxSVs associated with differential gene expression and chromatin accessibility across various brain regions and cell types. Additionally, cxSVs detected in brains of psychiatric cases are enriched for linkage with psychiatric GWAS risk alleles detected in the same brains. Furthermore, our analysis revealed significantly decreased brain-region- and cell-type-specific expression of cxSV genes, specifically for psychiatric cases, implicating cxSVs in the molecular etiology of major neuropsychiatric disorders.

2.
bioRxiv ; 2024 Sep 10.
Article in English | MEDLINE | ID: mdl-39314359

ABSTRACT

Vascular beds show different propensities for different vascular pathologies, yet mechanisms explaining these fundamental differences remain unknown. We sought to build a transcriptomic, cellular, and spatial atlas of human arterial cells across multiple different arterial segments to understand this phenomenon. We found significant cell type-specific segmental heterogeneity. Determinants of arterial identity are predominantly encoded in fibroblasts and smooth muscle cells, and their differentially expressed genes are particularly enriched for vascular disease-associated loci and genes. Adventitial fibroblast-specific heterogeneity in gene expression coincides with numerous vascular disease risk genes, suggesting a previously unrecognized role for this cell type in disease risk. Adult arterial cells from different segments cluster not by anatomical proximity but by embryonic origin, with differentially regulated genes heavily influenced by developmental master regulators. Non-coding transcriptomes across arterial cells contain extensive variation in lnc-RNAs expressed in cell type- and segment-specific patterns, rivaling heterogeneity in protein coding transcriptomes, and show enrichment for non-coding genetic signals for vascular diseases.

3.
Epidemiology ; 2024 Sep 24.
Article in English | MEDLINE | ID: mdl-39316822

ABSTRACT

BACKGROUND: Colorectal cancer (CRC) is a common, fatal cancer. Identifying subgroups who may benefit more from intervention is of critical public health importance. Previous studies have assessed multiplicative interaction between genetic risk scores and environmental factors, but few have assessed additive interaction, the relevant public health measure. METHODS: Using resources from colorectal cancer consortia including 45,247 CRC cases and 52,671 controls, we assessed multiplicative and additive interaction (relative excess risk due to interaction, RERI) using logistic regression between 13 harmonized environmental factors and genetic risk score including 141 variants associated with CRC risk. RESULTS: There was no evidence of multiplicative interaction between environmental factors and genetic risk score. There was additive interaction where, for individuals with high genetic susceptibility, either heavy drinking [RERI = 0.24, 95% confidence interval, CI, (0.13, 0.36)], ever smoking [0.11 (0.05, 0.16)], high BMI [female 0.09 (0.05, 0.13), male 0.10 (0.05, 0.14)], or high red meat intake [highest versus lowest quartile 0.18 (0.09, 0.27)] was associated with excess CRC risk greater than that for individuals with average genetic susceptibility. Conversely, we estimate those with high genetic susceptibility may benefit more from reducing CRC risk with aspirin/NSAID use [-0.16 (-0.20, -0.11)] or higher intake of fruit, fiber, or calcium [highest quartile versus lowest quartile -0.12 (-0.18, -0.050); -0.16 (-0.23, -0.09); -0.11 (-0.18, -0.05), respectively] than those with average genetic susceptibility. CONCLUSIONS: Additive interaction is important to assess for identifying subgroups who may benefit from intervention. The subgroups identified in this study may help inform precision CRC prevention.

4.
bioRxiv ; 2024 Sep 06.
Article in English | MEDLINE | ID: mdl-39282369

ABSTRACT

Cell types evolve into a hierarchy with related types grouped into families. How cell type diversification is constrained by the stable separation between families over vast evolutionary times remains unknown. Here, integrating single-nucleus multiomic sequencing and deep learning, we show that hundreds of sequence features (motifs) divide into distinct sets associated with accessible genomes of specific cell type families. This division is conserved across highly divergent, early-branching animals including flatworms and cnidarians. While specific interactions between motifs delineate cell type relationships within families, surprisingly, these interactions are not conserved between species. Consistently, while deep learning models trained on one species can predict accessibility of other species' sequences, their predictions frequently rely on distinct, but synonymous, motif combinations. We propose that long-term stability of cell type families is maintained through genome access specified by conserved motif sets, or 'vocabularies', whereas cell types diversify through flexible use of motifs within each set.

5.
bioRxiv ; 2024 Sep 08.
Article in English | MEDLINE | ID: mdl-39282388

ABSTRACT

Distant-acting enhancers are central to human development. However, our limited understanding of their functional sequence features prevents the interpretation of enhancer mutations in disease. Here, we determined the functional sensitivity to mutagenesis of human developmental enhancers in vivo. Focusing on seven enhancers active in the developing brain, heart, limb and face, we created over 1700 transgenic mice for over 260 mutagenized enhancer alleles. Systematic mutation of 12-basepair blocks collectively altered each sequence feature in each enhancer at least once. We show that 69% of all blocks are required for normal in vivo activity, with mutations more commonly resulting in loss (60%) than in gain (9%) of function. Using predictive modeling, we annotated critical nucleotides at base-pair resolution. The vast majority of motifs predicted by these machine learning models (88%) coincided with changes to in vivo function, and the models showed considerable sensitivity, identifying 59% of all functional blocks. Taken together, our results reveal that human enhancers contain a high density of sequence features required for their normal in vivo function and provide a rich resource for further exploration of human enhancer logic.

6.
bioRxiv ; 2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39345598

ABSTRACT

Three-dimensional nuclear DNA architecture comprises well-studied intra-chromosomal (cis) folding and less characterized inter-chromosomal (trans) interfaces. Current predictive models of 3D genome folding can effectively infer pairwise cis-chromatin interactions from the primary DNA sequence but generally ignore trans contacts. There is an unmet need for robust models of trans-genome organization that provide insights into their underlying principles and functional relevance. We present TwinC, an interpretable convolutional neural network model that reliably predicts trans contacts measurable through genome-wide chromatin conformation capture (Hi-C). TwinC uses a paired sequence design from replicate Hi-C experiments to learn single base pair relevance in trans interactions across two stretches of DNA. The method achieves high predictive accuracy (AUROC=0.80) on a cross-chromosomal test set from Hi-C experiments in heart tissue. Mechanistically, the neural network learns the importance of compartments, chromatin accessibility, clustered transcription factor binding and G-quadruplexes in forming trans contacts. In summary, TwinC models and interprets trans genome architecture, shedding light on this poorly understood aspect of gene regulation.

7.
medRxiv ; 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39211867

ABSTRACT

Precision medicine promises significant health benefits but faces challenges such as the need for complex data management and analytics, interdisciplinary collaboration, and education of researchers, healthcare professionals, and participants. Addressing these needs requires the integration of computational experts, engineers, designers, and healthcare professionals to develop user-friendly systems and shared terminologies. The widespread adoption of large language models (LLMs) like GPT-4 and Claude 3 highlights the importance of making complex data accessible to non-specialists. The Stanford Data Ocean (SDO) strives to mitigate these challenges through a scalable, cloud-based platform that supports data management for various data types, advanced research, and personalized learning in precision medicine. SDO provides AI tutors and AI-powered data visualization tools to enhance educational and research outcomes and make data analysis accessible for users from diverse educational backgrounds. By extending engagement and cutting-edge research capabilities globally, SDO particularly benefits economically disadvantaged and historically marginalized communities, fostering interdisciplinary biomedical research and bridging the gap between education and practical application in the biomedical field.

8.
bioRxiv ; 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38895386

ABSTRACT

In most eukaryotes, mitochondrial organelles contain their own genome, usually circular, which is the remnant of the genome of the ancestral bacterial endosymbiont that gave rise to modern mitochondria. Mitochondrial genomes are dramatically reduced in their gene content due to the process of endosymbiotic gene transfer to the nucleus; as a result most mitochondrial proteins are encoded in the nucleus and imported into mitochondria. This includes the components of the dedicated mitochondrial transcription and replication systems and regulatory factors, which are entirely distinct from the information processing systems in the nucleus. However, since the 1990s several nuclear transcription factors have been reported to act in mitochondria, and previously we identified 8 human and 3 mouse transcription factors (TFs) with strong localized enrichment over the mitochondrial genome using ChIP-seq (Chromatin Immunoprecipitation) datasets from the second phase of the ENCODE (Encyclopedia of DNA Elements) Project Consortium. Here, we analyze the ENCODE compendium of TF ChIP-seq datasets, which has greatly expanded in the intervening decade (a total of 6,153 ChIP experiments for 942 proteins, of which 763 are sequence-specific TFs), combined with interpretative deep learning models of TF occupancy to create a comprehensive compendium of nuclear TFs that show evidence of association with the mitochondrial genome. We find some evidence for chrM occupancy for 50 nuclear TFs and two other proteins, with bZIP TFs emerging as most likely to be playing a role in mitochondria. However, we also observe that in cases where the same TF has been assayed with multiple antibodies and ChIP protocols, evidence for its chrM occupancy is not always reproducible. In the light of these findings, we discuss the evidential criteria for establishing chrM occupancy and reevaluate the overall compendium of putative mitochondrial-acting nuclear TFs.

9.
bioRxiv ; 2024 May 31.
Article in English | MEDLINE | ID: mdl-38853896

ABSTRACT

Despite extensive characterization of mammalian Pol II transcription, the DNA sequence determinants of transcription initiation at a third of human promoters and most enhancers remain poorly understood. Hence, we trained and interpreted a neural network called ProCapNet that accurately models base-resolution initiation profiles from PRO-cap experiments using local DNA sequence. ProCapNet learns sequence motifs with distinct effects on initiation rates and TSS positioning and uncovers context-specific cryptic initiator elements intertwined within other TF motifs. ProCapNet annotates predictive motifs in nearly all actively transcribed regulatory elements across multiple cell-lines, revealing a shared cis-regulatory logic across promoters and enhancers mediated by a highly epistatic sequence syntax of cooperative and competitive motif interactions. ProCapNet models of RAMPAGE profiles measuring steady-state RNA abundance at TSSs distill initiation signals on par with models trained directly on PRO-cap profiles. ProCapNet learns a largely cell-type-agnostic cis-regulatory code of initiation complementing sequence drivers of cell-type-specific chromatin state critical for accurate prediction of cell-type-specific transcription initiation.

10.
bioRxiv ; 2024 May 29.
Article in English | MEDLINE | ID: mdl-38853998

ABSTRACT

Deep learning approaches have made significant advances in predicting cell type-specific chromatin patterns from the identity and arrangement of transcription factor (TF) binding motifs. However, most models have been applied in unperturbed contexts, precluding a predictive understanding of how chromatin state responds to TF perturbation. Here, we used transfer learning to train and interpret deep learning models that use DNA sequence to predict, with accuracy approaching experimental reproducibility, how the concentration of two dosage-sensitive TFs (TWIST1, SOX9) affects regulatory element (RE) chromatin accessibility in facial progenitor cells. High-affinity motifs that allow for heterotypic TF co-binding and are concentrated at the center of REs buffer against quantitative changes in TF dosage and strongly predict unperturbed accessibility. In contrast, motifs with low-affinity or homotypic binding distributed throughout REs lead to sensitive responses with minimal contributions to unperturbed accessibility. Both buffering and sensitizing features show signatures of purifying selection. We validated these predictive sequence features using reporter assays and showed that a biophysical model of TF-nucleosome competition can explain the sensitizing effect of low-affinity motifs. Our approach of combining transfer learning and quantitative measurements of the chromatin response to TF dosage therefore represents a powerful method to reveal additional layers of the cis-regulatory code.

11.
Cell ; 187(16): 4408-4425.e23, 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-38925112

ABSTRACT

Most mammalian genes have multiple polyA sites, representing a substantial source of transcript diversity regulated by the cleavage and polyadenylation (CPA) machinery. To better understand how these proteins govern polyA site choice, we introduce CPA-Perturb-seq, a multiplexed perturbation screen dataset of 42 CPA regulators with a 3' scRNA-seq readout that enables transcriptome-wide inference of polyA site usage. We develop a framework to detect perturbation-dependent changes in polyadenylation and characterize modules of co-regulated polyA sites. We find groups of intronic polyA sites regulated by distinct components of the nuclear RNA life cycle, including elongation, splicing, termination, and surveillance. We train and validate a deep neural network (APARENT-Perturb) for tandem polyA site usage, delineating a cis-regulatory code that predicts perturbation response and reveals interactions between regulatory complexes. Our work highlights the potential for multiplexed single-cell perturbation screens to further our understanding of post-transcriptional regulation.


Subjects
Poly A, Polyadenylation, Single-Cell Analysis, Single-Cell Analysis/methods, Humans, Poly A/metabolism, Animals, Mice, Introns/genetics, Transcriptome/genetics, Messenger RNA/metabolism, Messenger RNA/genetics, Gene Expression Regulation

12.
Sci Adv ; 10(21): eadj4452, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38781344

ABSTRACT

Most genetic variants associated with psychiatric disorders are located in noncoding regions of the genome. To investigate their functional implications, we integrate epigenetic data from the PsychENCODE Consortium and other published sources to construct a comprehensive atlas of candidate brain cis-regulatory elements. Using deep learning, we model these elements' sequence syntax and predict how binding sites for lineage-specific transcription factors contribute to cell type-specific gene regulation in various types of glia and neurons. The elements' evolutionary history suggests that new regulatory information in the brain emerges primarily via smaller sequence mutations within conserved mammalian elements rather than entirely new human- or primate-specific sequences. However, primate-specific candidate elements, particularly those active during fetal brain development and in excitatory neurons and astrocytes, are implicated in the heritability of brain-related human traits. Additionally, we introduce PsychSCREEN, a web-based platform offering interactive visualization of PsychENCODE-generated genetic and epigenetic data from diverse brain cell types in individuals with psychiatric disorders and healthy controls.


Subjects
Brain, Genetic Epigenesis, Nucleic Acid Regulatory Sequences, Humans, Brain/metabolism, Nucleic Acid Regulatory Sequences/genetics, Animals, Molecular Evolution, Mental Disorders/genetics, Transcriptional Regulatory Elements/genetics, Neurons/metabolism, Gene Expression Regulation, Transcription Factors/genetics, Transcription Factors/metabolism

13.
Sci Adv ; 10(22): eadk3121, 2024 May 31.
Article in English | MEDLINE | ID: mdl-38809988

ABSTRACT

Regular, long-term aspirin use may act synergistically with genetic variants, particularly those in mechanistically relevant pathways, to confer a protective effect on colorectal cancer (CRC) risk. We leveraged pooled data from 52 clinical trial, cohort, and case-control studies that included 30,806 CRC cases and 41,861 controls of European ancestry to conduct a genome-wide interaction scan between regular aspirin/nonsteroidal anti-inflammatory drug (NSAID) use and imputed genetic variants. After adjusting for multiple comparisons, we identified statistically significant interactions between regular aspirin/NSAID use and variants in 6q24.1 (top hit rs72833769), which has evidence of influencing expression of TBC1D7 (a subunit of the TSC1-TSC2 complex, a key regulator of MTOR activity), and variants in 5p13.1 (top hit rs350047), which is associated with expression of PTGER4 (codes a cell surface receptor directly involved in the mode of action of aspirin). Genetic variants with functional impact may modulate the chemopreventive effect of regular aspirin use, and our study identifies putative previously unidentified targets for additional mechanistic interrogation.


Subjects
Non-Steroidal Anti-Inflammatory Agents, Colorectal Neoplasms, Genome-Wide Association Study, Single Nucleotide Polymorphism, Humans, Colorectal Neoplasms/genetics, Colorectal Neoplasms/drug therapy, Non-Steroidal Anti-Inflammatory Agents/pharmacology, Aspirin/pharmacology, Prostaglandin E Receptors EP4 Subtype/genetics, Prostaglandin E Receptors EP4 Subtype/metabolism, Male, Genetic Predisposition to Disease, Female, Case-Control Studies, Middle Aged, Genetic Loci, Aged

14.
EBioMedicine ; 104: 105146, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38749303

ABSTRACT

BACKGROUND: Consumption of fibre, fruits and vegetables has been linked with lower colorectal cancer (CRC) risk. A genome-wide gene-environment (G × E) analysis was performed to test whether genetic variants modify these associations. METHODS: A pooled sample of 45 studies including up to 69,734 participants (cases: 29,896; controls: 39,838) of European ancestry was included. To identify G × E interactions, we used the traditional 1-degree-of-freedom (1-DF) G × E test and, to improve power, a 2-step procedure and a 3-DF joint test that investigates the association between a genetic variant and dietary exposure, CRC risk and G × E interaction simultaneously. FINDINGS: The 3-DF joint test revealed two significant loci with p-value <5 × 10⁻⁸. Rs4730274 close to the SLC26A3 gene showed an association with fibre (p-value: 2.4 × 10⁻³) and G × fibre interaction with CRC (OR per quartile of fibre increase = 0.87, 0.80, and 0.75 for CC, TC, and TT genotype, respectively; G × E p-value: 1.8 × 10⁻⁷). Rs1620977 in the NEGR1 gene showed an association with fruit intake (p-value: 1.0 × 10⁻⁸) and G × fruit interaction with CRC (OR per quartile of fruit increase = 0.75, 0.65, and 0.56 for AA, AG, and GG genotype, respectively; G × E p-value: 0.029). INTERPRETATION: We identified 2 loci associated with fibre and fruit intake that also modify the association of these dietary factors with CRC risk. Potential mechanisms include chronic inflammatory intestinal disorders, and gut function. However, further studies are needed for mechanistic validation and replication of findings. FUNDING: National Institutes of Health, National Cancer Institute. Full funding details for the individual consortia are provided in acknowledgments.


Subjects
Colorectal Neoplasms, Dietary Fiber, Fruit, Gene-Environment Interaction, Genetic Predisposition to Disease, Genome-Wide Association Study, Single Nucleotide Polymorphism, Vegetables, Humans, Colorectal Neoplasms/genetics, Colorectal Neoplasms/etiology, Dietary Fiber/administration & dosage, Genotype, Diet, Male, Female, Risk Factors

15.
bioRxiv ; 2024 Apr 14.
Article in English | MEDLINE | ID: mdl-38645064

ABSTRACT

Over the past 15 years, a variety of next-generation sequencing assays have been developed for measuring the 3D conformation of DNA in the nucleus. Each of these assays gives, for a particular cell or tissue type, a distinct picture of 3D chromatin architecture. Accordingly, making sense of the relationship between genome structure and function requires teasing apart two closely related questions: how does chromatin 3D structure change from one cell type to the next, and how do different measurements of that structure differ from one another, even when the two assays are carried out in the same cell type? In this work, we assemble a collection of chromatin 3D datasets, each represented as a 2D contact map, spanning multiple assay types and cell types. We then build a machine learning model that predicts missing contact maps in this collection. We use the model to systematically explore how genome 3D architecture changes, at the level of compartments, domains, and loops, between cell types and between assay types.

16.
Br J Cancer ; 130(10): 1687-1696, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38561434

ABSTRACT

BACKGROUND: Menopausal hormone therapy (MHT), a common treatment to relieve symptoms of menopause, is associated with a lower risk of colorectal cancer (CRC). To inform CRC risk prediction and MHT risk-benefit assessment, we aimed to evaluate the joint association of a polygenic risk score (PRS) for CRC and MHT on CRC risk. METHODS: We used data from 28,486 postmenopausal women (11,519 cases and 16,967 controls) of European descent. A PRS based on 141 CRC-associated genetic variants was modeled as a categorical variable in quartiles. Multiplicative interaction between PRS and MHT use was evaluated using logistic regression. Additive interaction was measured using the relative excess risk due to interaction (RERI). 30-year cumulative risks of CRC for 50-year-old women according to MHT use and PRS were calculated. RESULTS: The reduction in odds ratios by MHT use was larger in women within the highest quartile of PRS compared to that in women within the lowest quartile of PRS (p-value = 2.7 × 10⁻⁸). At the highest quartile of PRS, the 30-year CRC risk was statistically significantly lower for women taking any MHT than for women not taking any MHT, 3.7% (3.3%-4.0%) vs 6.1% (5.7%-6.5%) (difference 2.4%, P-value = 1.83 × 10⁻¹⁴); these differences were also statistically significant but smaller in magnitude in the lowest PRS quartile, 1.6% (1.4%-1.8%) vs 2.2% (1.9%-2.4%) (difference 0.6%, P-value = 1.01 × 10⁻³), indicating 4 times greater reduction in absolute risk associated with any MHT use in the highest compared to the lowest quartile of genetic CRC risk. CONCLUSIONS: MHT use has a greater impact on the reduction of CRC risk for women at higher genetic risk. These findings have implications for the development of risk prediction models for CRC and potentially for the consideration of genetic information in the risk-benefit assessment of MHT use.
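The "4 times greater reduction in absolute risk" stated in the abstract above follows from the quoted 30-year cumulative risks. The short sketch below reproduces that arithmetic using only the point estimates given in the abstract. The function and variable names are hypothetical, and the confidence intervals are ignored.

```python
def absolute_risk_reduction(risk_without: float, risk_with: float) -> float:
    """Difference in cumulative risk (percentage points) between non-users and users."""
    return risk_without - risk_with


# 30-year cumulative CRC risks in percent, as quoted in the abstract.
highest_quartile = absolute_risk_reduction(risk_without=6.1, risk_with=3.7)
lowest_quartile = absolute_risk_reduction(risk_without=2.2, risk_with=1.6)

# Ratio of the two absolute reductions (highest vs lowest PRS quartile).
ratio = highest_quartile / lowest_quartile

print(f"Highest PRS quartile: {highest_quartile:.1f} percentage points")
print(f"Lowest PRS quartile:  {lowest_quartile:.1f} percentage points")
print(f"Ratio of reductions:  {ratio:.1f}")
```

The comparison is on the absolute (additive) scale, which is why the abstract reports it separately from the multiplicative interaction test on odds ratios.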


Subjects
Colorectal Neoplasms, Genetic Predisposition to Disease, Humans, Female, Colorectal Neoplasms/genetics, Colorectal Neoplasms/epidemiology, Middle Aged, Case-Control Studies, Risk Factors, Aged, Hormone Replacement Therapy/adverse effects, Risk Assessment, Menopause, Postmenopause, Estrogen Replacement Therapy/adverse effects

17.
Nat Methods ; 21(4): 723-734, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38504114

ABSTRACT

The ENCODE Consortium's efforts to annotate noncoding cis-regulatory elements (CREs) have advanced our understanding of gene regulatory landscapes. Pooled, noncoding CRISPR screens offer a systematic approach to investigate cis-regulatory mechanisms. The ENCODE4 Functional Characterization Centers conducted 108 screens in human cell lines, comprising >540,000 perturbations across 24.85 megabases of the genome. Using 332 functionally confirmed CRE-gene links in K562 cells, we established guidelines for screening endogenous noncoding elements with CRISPR interference (CRISPRi), including accurate detection of CREs that exhibit variable, often low, transcriptional effects. Benchmarking five screen analysis tools, we find that CASA produces the most conservative CRE calls and is robust to artifacts of low-specificity single guide RNAs. We uncover a subtle DNA strand bias for CRISPRi in transcribed regions with implications for screen design and analysis. Together, we provide an accessible data resource, predesigned single guide RNAs for targeting 3,275,697 ENCODE SCREEN candidate CREs with CRISPRi and screening guidelines to accelerate functional characterization of the noncoding genome.


Subjects
CRISPR-Cas Systems, Clustered Regularly Interspaced Short Palindromic Repeats, Humans, Clustered Regularly Interspaced Short Palindromic Repeats/genetics, CRISPR-Cas Systems/genetics, Genome, K562 Cells, CRISPR-Cas Systems Guide RNA

18.
STAR Protoc ; 5(2): 102941, 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-38483898

ABSTRACT

Dinoflagellate genomes are often very large and difficult to assemble, which has until recently precluded their analysis with modern functional genomic tools. Here, we present a protocol for mapping three-dimensional (3D) genome organization in dinoflagellates and using it for scaffolding their genome assemblies. We describe steps for crosslinking, nuclear lysis, denaturation, restriction digest, ligation, and DNA shearing and purification. We then detail procedures for sequencing library generation and computational analysis, including initial Hi-C read mapping and 3D-DNA scaffolding/assembly correction. For complete details on the use and execution of this protocol, please refer to Marinov et al. (1).


Subjects
Dinoflagellida, Protozoan Genome, Dinoflagellida/genetics, Protozoan Genome/genetics, Genomics/methods, Chromosome Mapping/methods, DNA Sequence Analysis/methods

19.
bioRxiv ; 2024 Jan 13.
Article in English | MEDLINE | ID: mdl-37873443

ABSTRACT

The COVID-19 pandemic, caused by the SARS-CoV-2 virus, has led to significant global morbidity and mortality. A crucial viral protein, the non-structural protein 14 (nsp14), catalyzes the methylation of viral RNA and plays a critical role in viral genome replication and transcription. Due to the low mutation rate in the nsp region among various SARS-CoV-2 variants, nsp14 has emerged as a promising therapeutic target. However, discovering potential inhibitors remains a challenge. In this work, we introduce a computational pipeline for the rapid and efficient identification of potential nsp14 inhibitors by leveraging virtual screening and the NCI open compound collection, which contains 250,000 freely available molecules for researchers worldwide. The introduced pipeline provides a cost-effective and efficient approach for early-stage drug discovery by allowing researchers to evaluate promising molecules without incurring synthesis expenses. Our pipeline successfully identified seven promising candidates after experimentally validating only 40 compounds. Notably, we discovered NSC620333, a compound that exhibits a strong binding affinity to nsp14 with a dissociation constant of 427 ± 84 nM. In addition, we gained new insights into the structure and function of this protein through molecular dynamics simulations. We identified new conformational states of the protein and determined that residues Phe367, Tyr368, and Gln354 within the binding pocket serve as stabilizing residues for novel ligand interactions. We also found that metal coordination complexes are crucial for the overall function of the binding pocket. Lastly, we present the solved crystal structure of the nsp14-MTase complexed with SS148 (PDB:8BWU), a potent inhibitor of methyltransferase activity at the nanomolar level (IC50 value of 70 ± 6 nM). Our computational pipeline accurately predicted the binding pose of SS148, demonstrating its effectiveness and potential in accelerating drug discovery efforts against SARS-CoV-2 and other emerging viruses.

20.
bioRxiv ; 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-35547855

ABSTRACT

Clinical diagnosis typically incorporates physical examination, patient history, and various laboratory tests and imaging studies, but makes limited use of the human system's own record of antigen exposures encoded by receptors on B cells and T cells. We analyzed immune receptor datasets from 593 individuals to develop MAchine Learning for Immunological Diagnosis (Mal-ID), an interpretive framework to screen for multiple illnesses simultaneously or precisely test for one condition. This approach detects specific infections, autoimmune disorders, vaccine responses, and disease severity differences. Human-interpretable features of the model recapitulate known immune responses to SARS-CoV-2, Influenza, and HIV, highlight antigen-specific receptors, and reveal distinct characteristics of Systemic Lupus Erythematosus and Type-1 Diabetes autoreactivity. This analysis framework has broad potential for scientific and clinical interpretation of human immune responses.
