1.
J Chem Inf Model ; 2024 Sep 20.
Article in English | MEDLINE | ID: mdl-39302256

ABSTRACT

A knowledge graph (KG) is a technique for modeling entities and their interrelations. Knowledge graph embedding (KGE) translates these entities and relationships into a continuous vector space to facilitate dense and efficient representations. In the domain of chemistry, applying KG and KGE techniques integrates heterogeneous chemical information into a coherent and user-friendly framework, enhances the representation of chemical data features, and is beneficial for downstream tasks, such as chemical property prediction. This paper begins with a comprehensive review of classical and contemporary KGE methodologies, including distance-based models, semantic matching models, and neural network-based approaches. We then catalogue the primary databases employed in chemistry and biochemistry that furnish the KGs with essential chemical data. Subsequently, we explore the latest applications of KG and KGE in chemistry, focusing on risk assessment, property prediction, and drug discovery. Finally, we discuss the current challenges to KG and KGE techniques and provide a perspective on their potential future developments.

2.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38975896

ABSTRACT

Mechanisms of protein-DNA interactions are involved in a wide range of biological activities and processes. Accurately identifying binding sites between proteins and DNA is crucial for analyzing genetic material, exploring protein functions, and designing novel drugs. In recent years, several computational methods have been proposed as alternatives to time-consuming and expensive traditional experiments. However, accurately predicting protein-DNA binding sites still remains a challenge. Existing computational methods often rely on handcrafted features and a single-model architecture, leaving room for improvement. We propose a novel computational method, called EGPDI, based on multi-view graph embedding fusion. This approach involves the integration of Equivariant Graph Neural Networks (EGNN) and Graph Convolutional Networks II (GCNII), independently configured to profoundly mine the global and local node embedding representations. An advanced gated multi-head attention mechanism is subsequently employed to capture the attention weights of the dual embedding representations, thereby facilitating the integration of node features. Besides, extra node features from protein language models are introduced to provide more structural information. To our knowledge, this is the first time that multi-view graph embedding fusion has been applied to the task of protein-DNA binding site prediction. The results of five-fold cross-validation and independent testing demonstrate that EGPDI outperforms state-of-the-art methods. Further comparative experiments and case studies also verify the superiority and generalization ability of EGPDI.


Subject(s)
Computational Biology , DNA-Binding Proteins , DNA , Neural Networks, Computer , Binding Sites , DNA/metabolism , DNA/chemistry , DNA-Binding Proteins/metabolism , DNA-Binding Proteins/chemistry , Computational Biology/methods , Algorithms , Protein Binding
3.
Front Genet ; 15: 1399810, 2024.
Article in English | MEDLINE | ID: mdl-38798699

ABSTRACT

Increasing research findings suggest that circular RNA (circRNA) exerts a crucial function in the pathogenesis of complex human diseases by binding to miRNA. Identifying their potential interactions is of paramount importance for the diagnosis and treatment of diseases. However, long cycles, small scales, and time-consuming processes characterize previous biological wet experiments. Consequently, the use of an efficient computational model to forecast the interactions between circRNA and miRNA is gradually becoming mainstream. In this study, we present a new prediction model named BJLD-CMI. The model extracts circRNA and miRNA sequence features by applying the Jaccard and BERT methods and integrates them to obtain CMI attribute features; it then uses the graph embedding method LINE to extract CMI behavioral features from the known circRNA-miRNA correlation graph. We then predict potential circRNA-miRNA interactions by fusing the multi-angle feature information, such as attribute and behavior features, through Autoencoder in Autoencoder Networks. BJLD-CMI attained areas under the ROC curve of 94.95% and 90.69% on the CMI-9589 and CMI-9905 datasets, respectively. When compared with existing models, the results indicate that BJLD-CMI exhibits the best overall competence. During the case study experiment, we conducted a PubMed literature search to confirm that out of the top 10 predicted CMIs, seven pairs did indeed exist. These results suggest that BJLD-CMI is an effective method for predicting interactions between circRNAs and miRNAs. It provides valuable candidates for biological wet experiments and can reduce the burden on researchers.
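
The abstract mentions Jaccard-based sequence features without giving details. As a minimal sketch, assuming sequences are compared through their sets of overlapping k-mers (the choice of k-mers and of k = 3 is an illustrative assumption, not the authors' stated procedure), a Jaccard similarity between two RNA sequences could look like this:

```python
def kmers(seq, k=3):
    """Return the set of overlapping k-mers in a sequence (assumed featurization)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(seq_a, seq_b, k=3):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between the k-mer sets of two sequences."""
    set_a, set_b = kmers(seq_a, k), kmers(seq_b, k)
    union = set_a | set_b
    if not union:  # both sequences shorter than k
        return 0.0
    return len(set_a & set_b) / len(union)


# Toy sequences, invented for illustration only.
print(jaccard("AUGCA", "AUGCU"))
```

The two toy sequences share two of four distinct 3-mers, so the similarity is 0.5.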

4.
Molecules ; 29(10)2024 May 10.
Article in English | MEDLINE | ID: mdl-38792096

ABSTRACT

Modelling size-realistic nanomaterials to analyse some of their properties, such as toxicity, solubility, or electronic structure, is a current challenge in computational and theoretical chemistry. The representation of the all-atom three-dimensional structure of a nanocompound would be ideal, as it could account explicitly for structural effects. However, the use of the whole structure is tedious due to the high data management and the structural complexity that accompanies the surface of the nanoparticle. Developing appropriate tools that enable a quantitative analysis of the structure, as well as the selection of regions of interest such as the core-shell, is a crucial step toward enabling the efficient analysis and processing of model nanostructures. The aim of this study was twofold. First, we defined the NanoFingerprint, which is a representation of a nanocompound in the form of a vector based on its 3D structure. The local relationship between atoms, i.e., their coordination within successive layers of neighbours, allows the characterisation of the local structure through the atom connectivity, maintaining the information of the three-dimensional structure but increasing the management ability. Second, we present a web server, called ATENA, to generate NanoFingerprints and other tools based on the 3D structure of the nanocompounds. A case study is reported to show the validity of our new fingerprint tool and the usefulness of our server. The scientific community and also private companies have a new tool based on a public web server for exploring the toxicity of nanocompounds.

5.
Comput Biol Chem ; 110: 108079, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38704917

ABSTRACT

There is growing evidence that circRNAs play a crucial role in diverse important biological reactions related to human diseases. Within the area of biochemistry, a large number of wet experiments have been carried out in recent years to find circRNA-disease associations. Since wet experiments are expensive and laborious, computation-based solutions have increasingly attracted the attention of researchers. However, the performance of these methods is restricted by their inability to balance the distribution among various types of nodes. To remedy the problem, we present a novel computational method called GEHGAN to predict new circRNA-disease relationships, leveraging graph embedding and heterogeneous graph attention networks. Firstly, we calculate circRNA sequence similarity, circRNA RBP similarity, disease semantic similarity and the corresponding GIP kernel similarity to construct a heterogeneous graph. Secondly, a graph embedding method using random walks with jump and stay strategies is applied to obtain the preliminary embeddings of circRNAs and diseases, greatly improving the performance of the model. Thirdly, a multi-head graph attention network is employed to further update the embeddings, followed by an MLP as a predictor. Five-fold cross-validation indicates that GEHGAN achieves an AUC score of 0.9829 and an AUPR value of 0.9815 on the CircR2Diseasev2.0 database, and case studies on osteosarcoma, gastric and colorectal neoplasms further confirm the model's efficacy at identifying circRNA-disease correlations.
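
The GIP (Gaussian interaction profile) kernel similarity named above is commonly computed as K(i, j) = exp(-gamma * ||IP(i) - IP(j)||^2), where IP(i) is row i of a binary association matrix and gamma is a bandwidth normalised by the mean squared norm of the rows. The following is a small sketch of that standard formula; the toy matrix and the default `gamma_prime = 1.0` are assumptions for illustration, not values from the paper.

```python
import math


def gip_kernel(adj, gamma_prime=1.0):
    """Gaussian interaction profile kernel over the rows of a binary matrix.

    adj[i][j] = 1 if row entity i (e.g. a circRNA) is associated with
    column entity j (e.g. a disease), else 0.
    """
    n = len(adj)
    mean_sq_norm = sum(sum(x * x for x in row) for row in adj) / n
    if mean_sq_norm == 0:
        raise ValueError("association matrix has no known associations")
    gamma = gamma_prime / mean_sq_norm  # bandwidth normalisation
    return [
        [
            math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(adj[i], adj[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy circRNA-disease association matrix (3 circRNAs x 3 diseases), invented.
toy = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
similarity = gip_kernel(toy)
```

Rows with identical interaction profiles get similarity 1, and similarity decays as the profiles diverge.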


Subject(s)
RNA, Circular , RNA, Circular/genetics , Humans , Computational Biology , Algorithms
6.
Entropy (Basel) ; 26(5)2024 Apr 28.
Article in English | MEDLINE | ID: mdl-38785621

ABSTRACT

The integration of graph embedding technology and collaborative filtering algorithms has shown promise in enhancing the performance of recommendation systems. However, existing integrated recommendation algorithms often suffer from feature bias and lack effectiveness in personalized user recommendation. For instance, users' historical interactions with a certain class of items may inaccurately lead to recommendations of all items within that class, resulting in feature bias. Moreover, accommodating changes in user interests over time poses a significant challenge. This study introduces a novel recommendation model, RCKFM, which addresses these shortcomings by leveraging the CoFM model, TransR graph embedding model, backdoor tuning of causal inference, KL divergence, and the factorization machine model. RCKFM focuses on improving graph embedding technology, adjusting feature bias in embedding models, and achieving personalized recommendations. Specifically, it employs the TransR graph embedding model to handle various relationship types effectively, mitigates feature bias using causal inference techniques, and predicts changes in user interests through KL divergence, thereby enhancing the accuracy of personalized recommendations. Experimental evaluations conducted on publicly available datasets, including "MovieLens-1M" and "Douban dataset" from Kaggle, demonstrate the superior performance of the RCKFM model. The results indicate a significant improvement of between 3.17% and 6.81% in key indicators such as precision, recall, normalized discounted cumulative gain, and hit rate in the top-10 recommendation tasks. These findings underscore the efficacy and potential impact of the proposed RCKFM model in advancing recommendation systems.
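
The abstract does not say how KL divergence is applied to user interests. One plausible reading, sketched here purely as an assumption, is to compare a user's recent item-category distribution with an earlier one using KL(P || Q) = sum_i p_i log(p_i / q_i). The category counts and the smoothing constant `eps` are hypothetical.

```python
import math


def normalize(counts, eps=1e-9):
    """Turn raw interaction counts into a smoothed probability distribution."""
    total = sum(counts) + eps * len(counts)
    return [(c + eps) / total for c in counts]


def kl_divergence(p, q):
    """KL(P || Q) in nats; assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical interaction counts per category for one user in two periods.
earlier = normalize([9, 1, 0])
recent = normalize([3, 3, 4])
drift = kl_divergence(recent, earlier)  # larger value = larger shift in interests
```

A value near zero would indicate stable interests, while a large value would flag a shift that the recommender may need to account for.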

7.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38581416

ABSTRACT

The inference of gene regulatory networks (GRNs) from gene expression profiles has been a key issue in systems biology, prompting many researchers to develop diverse computational methods. However, most of these methods do not reconstruct directed GRNs with regulatory types because of the lack of benchmark datasets or defects in the computational methods. Here, we collect benchmark datasets and propose a deep learning-based model, DeepFGRN, for reconstructing fine gene regulatory networks (FGRNs) with both regulation types and directions. In addition, the GRNs of real species are always large graphs with direction and high sparsity, which impede the advancement of GRN inference. Therefore, DeepFGRN builds a node bidirectional representation module to capture the directed graph embedding representation of the GRN. Specifically, the source and target generators are designed to learn the low-dimensional dense embedding of the source and target neighbors of a gene, respectively. An adversarial learning strategy is applied to iteratively learn the real neighbors of each gene. In addition, because the expression profiles of genes with regulatory associations are correlative, a correlation analysis module is designed. Specifically, this module not only fully extracts gene expression features, but also captures the correlation between regulators and target genes. Experimental results show that DeepFGRN has a competitive capability for both GRN and FGRN inference. Potential biomarkers and therapeutic drugs for breast cancer, liver cancer, lung cancer and coronavirus disease 2019 are identified based on the candidate FGRNs, providing a possible opportunity to advance our knowledge of disease treatments.


Subject(s)
Gene Regulatory Networks , Liver Neoplasms , Humans , Systems Biology/methods , Transcriptome , Algorithms , Computational Biology/methods
8.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38581422

ABSTRACT

Reliable cell type annotations are crucial for investigating cellular heterogeneity in single-cell omics data. Although various computational approaches have been proposed for single-cell RNA sequencing (scRNA-seq) annotation, high-quality cell labels are still lacking in single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) data, because of extreme sparsity and inconsistent chromatin accessibility between datasets. Here, we present HyGAnno, a novel automated cell annotation method that transfers cell type information from a well-labeled scRNA-seq reference to an unlabeled scATAC-seq target, via a parallel graph neural network, in a semi-supervised manner. Unlike existing methods that utilize only gene expression or gene activity features, HyGAnno leverages genome-wide accessibility peak features to facilitate the training process. In addition, HyGAnno reconstructs a reference-target cell graph to detect cells with low prediction reliability, according to their specific graph connectivity patterns. HyGAnno was assessed across various datasets, showcasing its strengths in precise cell annotation, generating interpretable cell embeddings, robustness to noisy reference data and adaptability to tumor tissues.


Subject(s)
Chromatin , Neural Networks, Computer , Reproducibility of Results
9.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38605638

ABSTRACT

Recent advances in single-cell RNA sequencing technology have eased analyses of signaling networks of cells. Recently, cell-cell interaction has been studied based on various link prediction approaches on graph-structured data. These approaches have assumptions about the likelihood of node interaction, thus showing high performance for only some specific networks. Subgraph-based methods have solved this problem and outperformed other approaches by extracting local subgraphs from a given network. In this work, we present a novel method, called Subgraph Embedding of Gene expression matrix for prediction of CEll-cell COmmunication (SEGCECO), which uses an attributed graph convolutional neural network to predict cell-cell communication from single-cell RNA-seq data. SEGCECO captures the latent and explicit attributes of undirected, attributed graphs constructed from the gene expression profile of individual cells. High-dimensional and sparse single-cell RNA-seq data make converting the data into a graphical format a daunting task. We successfully overcome this limitation by applying SoptSC, a similarity-based optimization method in which the cell-cell communication network is built using a cell-cell similarity matrix which is learned from gene expression data. We performed experiments on six datasets extracted from the human and mouse pancreas tissue. Our comparative analysis shows that SEGCECO outperforms latent feature-based approaches, and the state-of-the-art method for link prediction, WLNM, with 0.99 ROC and 99% prediction accuracy. The datasets can be found at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE84133 and the code is publicly available at Github https://github.com/sheenahora/SEGCECO and Code Ocean https://codeocean.com/capsule/8244724/tree.


Subject(s)
Cell Communication , Signal Transduction , Humans , Animals , Mice , Cell Communication/genetics , Learning , Neural Networks, Computer , Gene Expression
10.
Med Biol Eng Comput ; 62(8): 2499-2510, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38635004

ABSTRACT

A tissue sample is a valuable resource for understanding a patient's symptoms and health status in relation to tumor growth. Recent research seeks to establish a connection between tissue-specific tumor samples and genetic markers (genes). This breakthrough has paved the way for personalized cancer therapies. With this motivation, the proposed model constructs a heterogeneous network based on tumor sample-gene relation data and gene-gene interaction data. This network also incorporates tissue-specific gene expression and primary site-based gene counts as features, enabling tissue-specific predictions. Graph neural networks (GNNs) have proven effective in modeling complex interactions and predicting links within this network. The proposed model has successfully predicted tumor-gene associations by leveraging sampling-based GNNs and link layer embedding. The model's performance metrics, such as AUC-ROC scores, reached approximately 94%, demonstrating the potential of this heterogeneous network in predicting tissue-specific tumor sample-gene links. This paper's findings highlight the importance of tissue-specific associations in cancer research.


Subject(s)
Neoplasms , Neural Networks, Computer , Humans , Neoplasms/genetics , Gene Regulatory Networks , Organ Specificity/genetics , Algorithms , Biomarkers, Tumor/genetics , Gene Expression Regulation, Neoplastic , ROC Curve
11.
Comput Biol Med ; 174: 108398, 2024 May.
Article in English | MEDLINE | ID: mdl-38608322

ABSTRACT

The recurrence of low-stage lung cancer poses a challenge due to its unpredictable nature and diverse patient responses to treatments. Personalized care and patient outcomes heavily rely on early relapse identification, yet current predictive models, despite their potential, lack comprehensive genetic data. This inadequacy fuels our research focus: integrating specific genetic information, such as pathway scores, into clinical data. Our aim is to refine machine learning models for more precise relapse prediction in early-stage non-small cell lung cancer. To address the scarcity of genetic data, we employ imputation techniques, leveraging publicly available datasets such as The Cancer Genome Atlas (TCGA), integrating pathway scores into our patient cohort from the Cancer Long Survivor Artificial Intelligence Follow-up (CLARIFY) project. Through the integration of imputed pathway scores from the TCGA dataset with clinical data, our approach achieves notable strides in predicting relapse among a held-out test set of 200 patients. By training machine learning models on enriched knowledge graph data, inclusive of triples derived from pathway score imputation, we achieve a promising precision of 82% and specificity of 91%. These outcomes highlight the potential of our models as supplementary tools within tumour, node, and metastasis (TNM) classification systems, offering improved prognostic capabilities for lung cancer patients. In summary, our research underscores the significance of refining machine learning models for relapse prediction in early-stage non-small cell lung cancer. Our approach, centered on imputing pathway scores and integrating them with clinical data, not only enhances predictive performance but also demonstrates the promising role of machine learning in anticipating relapse and ultimately elevating patient outcomes.


Subject(s)
Carcinoma, Non-Small-Cell Lung , Genomics , Lung Neoplasms , Machine Learning , Humans , Lung Neoplasms/genetics , Carcinoma, Non-Small-Cell Lung/genetics , Genomics/methods , Neoplasm Recurrence, Local/genetics , Female , Male , Databases, Genetic
12.
Life Sci Space Res (Amst) ; 41: 64-73, 2024 May.
Article in English | MEDLINE | ID: mdl-38670654

ABSTRACT

Microgravity in the space environment can potentially have various negative effects on the human body, one of which is bone loss. Given the increasing frequency of human space activities, there is an urgent need to identify effective anti-osteoporosis drugs for the microgravity environment. Traditional microgravity experiments conducted in space suffer from limitations such as time-consuming procedures, high costs, and small sample sizes. In recent years, the in-silico drug discovery method has emerged as a promising strategy due to the advancements in bioinformatics and computer technology. In this study, we first collected a total of 184,915 literature articles related to microgravity and bone loss. We employed a combination of dependency path extraction and clustering techniques to extract data from the text. Afterwards, we conducted data cleaning and standardization to integrate data from several sources, including The Global Network of Biomedical Relationships (GNBR), Curated Drug-Drug Interactions Database (DDInter), Search Tool for Interacting Chemicals (STITCH), DrugBank, and Traditional Chinese Medicines Integrated Database (TCMID). Through this integration process, we constructed the Microgravity Biology Knowledge Graph (MBKG) consisting of 134,796 biological entities and 3,395,273 triplets. Subsequently, the TransE model was utilized to perform knowledge graph embedding. By calculating the distances between entities in the model space, the model successfully predicted potential drugs for treating osteoporosis and microgravity-induced bone loss. The results indicate that out of the top 10 ranked western medicines, 7 have been approved for the treatment of osteoporosis. Additionally, among the top 10 ranked traditional Chinese medicines, 5 have scientific literature supporting their effectiveness in treating bone loss. Among the top 20 predicted medicines for microgravity-induced bone loss, 15 have been studied in microgravity or simulated microgravity environments, while the remaining 5 are also applicable for treating osteoporosis. This research highlights the potential application of MBKG in the field of space drug discovery.
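
As a rough illustration of the distance-based ranking described above, the following Python sketch scores candidate drugs with the TransE distance ||h + r - t||, where a smaller distance means a more plausible triple. The entity names, the `treats` relation, and the two-dimensional vectors are hypothetical and hand-picked for illustration; they are not taken from the MBKG or produced by a trained model.

```python
import math


def transe_distance(head, relation, tail):
    """Euclidean TransE distance ||h + r - t||; smaller means more plausible."""
    return math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


def rank_heads(candidates, relation, tail):
    """Sort candidate head entities by ascending distance to the given tail."""
    scored = [
        (name, transe_distance(vec, relation, tail))
        for name, vec in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical embeddings (not trained): two drugs, one relation, one disease.
drugs = {"drug_a": [0.9, -0.5], "drug_b": [0.0, 1.0]}
treats = [0.1, 0.5]
bone_loss = [1.0, 0.0]

for name, dist in rank_heads(drugs, treats, bone_loss):
    print(name, round(dist, 3))
```

In a real pipeline the vectors would come from training TransE on the knowledge graph, and the top-ranked candidates would then be checked against the literature, as the abstract describes.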


Subject(s)
Osteoporosis , Weightlessness , Humans , Osteoporosis/drug therapy , Drug Discovery , Bone Density Conservation Agents/therapeutic use , Computational Biology/methods , Computer Simulation
13.
Acta Crystallogr A Found Adv ; 80(Pt 3): 282-292, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38683646

ABSTRACT

Following the work of Day & Hawthorne [Acta Cryst. (2022), A78, 212-233] and Day et al. [Acta Cryst. (2024), A80, 258-281], the program GraphT-T has been developed to embed graphical representations of observed and hypothetical chains of (SiO4)4- tetrahedra into 2D and 3D Euclidean space. During embedding, the distance between linked vertices (T-T distances) and the distance between unlinked vertices (T...T separations) in the resultant unit-distance graph are restrained to the average observed distance between linked Si tetrahedra (3.06±0.15 Å) and the minimum separation between unlinked vertices is restrained to be equal to or greater than the minimum distance between unlinked Si tetrahedra (3.713 Å) in silicate minerals. The notional interactions between vertices are described by a 3D spring-force algorithm in which the attractive forces between linked vertices behave according to Hooke's law and the repulsive forces between unlinked vertices behave according to Coulomb's law. Embedding parameters (i.e. spring coefficient, k, and Coulomb's constant, K) are iteratively refined during embedding to determine if it is possible to embed a given graph to produce a unit-distance graph with T-T distances and T...T separations that are compatible with the observed T-T distances and T...T separations in crystal structures. The resultant unit-distance graphs are denoted as compatible and may form crystal structures if and only if all distances between linked vertices (T-T distances) agree with the average observed distance between linked Si tetrahedra (3.06±0.15 Å) and the minimum separation between unlinked vertices is equal to or greater than the minimum distance between unlinked Si tetrahedra (3.713 Å) in silicate minerals. If the unit-distance graph does not satisfy these conditions, it is considered incompatible and the corresponding chain of tetrahedra is unlikely to form crystal structures. Using GraphT-T, Day et al. [Acta Cryst. (2024), A80, 258-281] have shown that several topological properties of chain graphs influence the flexibility (and rigidity) of the corresponding chains of Si tetrahedra and may explain why particular compatible chain arrangements (and the minerals in which they occur) are more common than others and/or why incompatible chain arrangements do not occur in crystals despite being topologically possible.
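
The spring-force idea described above can be sketched in a few lines of Python. This is not the GraphT-T implementation: it works in 2D, keeps k and K fixed instead of refining them iteratively, and uses a simple overdamped update with an assumed step size. Linked vertices feel a Hooke force k(d - 3.06) and unlinked vertices a Coulomb-like repulsion K/d^2; the three-vertex chain and its starting coordinates are invented for illustration.

```python
import math


def relax(positions, edges, k=1.0, K=1.0, rest=3.06, step=0.05, iters=2000):
    """Relax 2D vertex positions under Hooke attraction and Coulomb repulsion.

    Assumes no two vertices start at the same point (d must stay non-zero).
    """
    pos = [list(p) for p in positions]
    linked = {frozenset(e) for e in edges}
    n = len(pos)
    for _ in range(iters):
        forces = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                dx, dy = pos[j][0] - pos[i][0], pos[j][1] - pos[i][1]
                d = math.hypot(dx, dy)
                ux, uy = dx / d, dy / d  # unit vector from i to j
                if frozenset((i, j)) in linked:
                    f = k * (d - rest)   # Hooke: pulls together when stretched
                else:
                    f = -K / (d * d)     # Coulomb-like: always pushes apart
                forces[i][0] += f * ux
                forces[i][1] += f * uy
                forces[j][0] -= f * ux
                forces[j][1] -= f * uy
        for i in range(n):
            pos[i][0] += step * forces[i][0]
            pos[i][1] += step * forces[i][1]
    return pos


# Toy chain 0-1-2 (vertices 0 and 2 are unlinked), starting from a right angle.
final = relax([(0.0, 0.0), (3.0, 0.0), (3.0, 3.0)], [(0, 1), (1, 2)])
t_t = [math.dist(final[0], final[1]), math.dist(final[1], final[2])]
t_sep = math.dist(final[0], final[2])
compatible = all(abs(d - 3.06) <= 0.15 for d in t_t) and t_sep >= 3.713
```

The final check mirrors the compatibility criterion stated in the abstract: linked distances within 3.06±0.15 Å and unlinked separations of at least 3.713 Å.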

14.
Acta Crystallogr A Found Adv ; 80(Pt 3): 258-281, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38683645

ABSTRACT

In Part I of this series, all topologically possible 1-periodic infinite graphs (chain graphs) representing chains of tetrahedra with up to 6-8 vertices (tetrahedra) per repeat unit were generated. This paper examines possible restraints on embedding these chain graphs into Euclidean space such that they are compatible with the metrics of chains of tetrahedra in observed crystal structures. Chain-silicate minerals with T = Si4+ (plus P5+, V5+, As5+, Al3+, Fe3+, B3+, Be2+, Zn2+ and Mg2+) have a grand nearest-neighbour ⟨T-T⟩ distance of 3.06±0.15 Å and a minimum T...T separation of 3.71 Å between non-nearest-neighbour tetrahedra, and in order for embedded chain graphs (called unit-distance graphs) to be possible atomic arrangements in crystals, they must conform to these metrics, a process termed equalization. It is shown that equalization of all acyclic chain graphs is possible in 2D and 3D, and that equalization of most cyclic chain graphs is possible in 3D but not necessarily in 2D. All unique ways in which non-isomorphic vertices may be moved are designated modes of geometric modification. If a mode (m) is applied to an equalized unit-distance graph such that a new geometrically distinct unit-distance graph is produced without changing the lengths of any edges, the mode is designated as valid (mv); if a new geometrically distinct unit-distance graph cannot be produced, the mode is invalid (mi). The parameters mv and mi are used to define ranges of rigidity of the unit-distance graphs, and are related to the edge-to-vertex ratio, e/n, of the parent chain graph. The program GraphT-T was developed to embed any chain graph into Euclidean space subject to the metric restraints on T-T and T...T. Embedding a selection of chain graphs with differing e/n ratios shows that the principal reason why many topologically possible chains cannot occur in crystal structures is due to violation of the requirement that T...T > 3.71 Å. Such a restraint becomes increasingly restrictive as e/n increases and indicates why chains with stoichiometry TO<2.5 do not occur in crystal structures.

15.
PeerJ Comput Sci ; 10: e1808, 2024.
Article in English | MEDLINE | ID: mdl-38435603

ABSTRACT

The purpose of knowledge embedding is to map entities and relations from the knowledge graph into low-dimensional dense vectors that can be applied to downstream tasks, such as connection prediction and intelligent classification. Existing knowledge embedding methods still have many limitations, such as the contradiction between the vast amount of data and limited computing power, and the challenge of effectively representing rare entities. This article proposes a knowledge embedding learning model, which incorporates a graph attention mechanism to integrate key node information. It can effectively aggregate key information from the global graph structure, shield redundant information, and represent rare nodes in the knowledge base independently of its own structure. We introduce a relation update layer to further update the relation based on the results of entity training. The experiment shows that our method matches or surpasses the performance of other baseline models in link prediction on the FB15K-237 dataset. The metric Hits@1 has increased by 10.9% compared to the second-ranked baseline model. In addition, we conducted further analysis on rare nodes with fewer neighborhoods, confirming that our model can embed rare nodes more accurately than the baseline models.
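
For readers unfamiliar with the Hits@1 metric reported above, here is a minimal sketch of how Hits@k is usually computed in link prediction: it is the fraction of test triples whose correct entity is ranked within the top k candidates. The list of ranks is invented for illustration and has no connection to the FB15K-237 results.

```python
def hits_at_k(ranks, k):
    """Fraction of test triples whose true entity is ranked at position <= k.

    ranks: 1-based rank of the correct entity for each test triple.
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


# Hypothetical ranks of the correct entity for five test triples.
example_ranks = [1, 3, 1, 12, 2]
print(hits_at_k(example_ranks, 1), hits_at_k(example_ranks, 10))
```

Hits@1 is the strictest variant, since only an exact top-ranked prediction counts as a hit.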

16.
Front Neurosci ; 18: 1303741, 2024.
Article in English | MEDLINE | ID: mdl-38525375

ABSTRACT

Brain network analysis provides essential insights into the diagnosis of brain disease. Integrating multiple neuroimaging modalities has been demonstrated to be more effective than using a single modality for brain network analysis. However, a majority of existing brain network analysis methods based on multiple modalities often overlook both complementary information and unique characteristics from various modalities. To tackle this issue, we propose the Beta-Informativeness-Diffusion Multilayer Graph Embedding (BID-MGE) method. The proposed method seamlessly integrates structural connectivity (SC) and functional connectivity (FC) to learn more comprehensive information for diagnosing neuropsychiatric disorders. Specifically, a novel beta distribution mapping function (beta mapping) is utilized to increase vital information and weaken insignificant connections. The refined information helps the diffusion process concentrate on crucial brain regions to capture more discriminative features. To maximize the preservation of the unique characteristics of each modality, we design an optimal scale multilayer brain network, the inter-layer connections of which depend on node informativeness. Then, a multilayer informativeness diffusion is proposed to capture complementary information and unique characteristics from various modalities and generate node representations by incorporating the features of each node with those of their connected nodes. Finally, the node representations are reconfigured using principal component analysis (PCA), and cosine distances are calculated with reference to multiple templates for statistical analysis and classification. We implement the proposed method for brain network analysis of neuropsychiatric disorders. The results indicate that our method effectively identifies crucial brain regions associated with diseases, providing valuable insights into the pathology of the disease, and surpasses other advanced methods in classification performance.

17.
Med Image Anal ; 94: 103144, 2024 May.
Article in English | MEDLINE | ID: mdl-38518530

ABSTRACT

Recently, functional magnetic resonance imaging (fMRI) based functional connectivity network (FCN) analysis via graph convolutional networks (GCNs) has shown promise for automated diagnosis of brain diseases by regarding the FCNs as irregular graph-structured data. However, multiview information and site influences of the FCNs in a multisite, multiatlas fMRI scenario have been understudied. In this paper, we propose a Class-consistency and Site-independence Multiview Hyperedge-Aware HyperGraph Embedding Learning (CcSi-MHAHGEL) framework to integrate FCNs constructed on multiple brain atlases in a multisite fMRI study. Specifically, for each subject, we first model brain network as a hypergraph for every brain atlas to characterize high-order relations among multiple vertexes, and then introduce a multiview hyperedge-aware hypergraph convolutional network (HGCN) to extract a multiatlas-based FCN embedding where hyperedge weights are adaptively learned rather than employing the fixed weights precalculated in traditional HGCNs. In addition, we formulate two modules to jointly learn the multiatlas-based FCN embeddings by considering the between-subject associations across classes and sites, respectively, i.e., a class-consistency module to encourage both compactness within every class and separation between classes for promoting discrimination in the embedding space, and a site-independence module to minimize the site dependence of the embeddings for mitigating undesired site influences due to differences in scanning platforms and/or protocols at multiple sites. Finally, the multiatlas-based FCN embeddings are fed into a few fully connected layers followed by the soft-max classifier for diagnosis decision. Extensive experiments on the ABIDE demonstrate the effectiveness of our method for autism spectrum disorder (ASD) identification. Furthermore, our method is interpretable by revealing ASD-relevant brain regions that are biologically significant.


Subject(s)
Autism Spectrum Disorder , Brain Diseases , Humans , Magnetic Resonance Imaging , Learning , Brain/diagnostic imaging
18.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38493342

ABSTRACT

Dynamic compartmentalization of eukaryotic DNA into active and repressed states enables diverse transcriptional programs to arise from a single genetic blueprint, whereas its dysregulation can be strongly linked to a broad spectrum of diseases. While single-cell Hi-C experiments allow for chromosome conformation profiling across many cells, they are still expensive and not widely available to most labs. Here, we propose an alternative approach, scENCORE, to computationally reconstruct chromatin compartments from the more affordable and widely accessible single-cell epigenetic data. First, scENCORE constructs a long-range epigenetic correlation graph to mimic chromatin interaction frequencies, where nodes and edges represent genome bins and their correlations. Then, it learns the node embeddings to cluster genome regions into A/B compartments and aligns different graphs to quantify chromatin conformation changes across conditions. Benchmarking using cell-type-matched Hi-C experiments demonstrates that scENCORE can robustly reconstruct A/B compartments in a cell-type-specific manner. Furthermore, our chromatin conformation switching studies highlight substantial compartment-switching events that may introduce notable regulatory and transcriptional changes in psychiatric disease. In summary, scENCORE allows accurate and cost-effective A/B compartment reconstruction to delineate higher-order chromatin structure heterogeneity in complex tissues.
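
The two steps described in this abstract (build a correlation graph over genome bins, then split the bins into two groups) can be sketched as follows. This is not scENCORE's method: the learned node embeddings are replaced here by the sign of the leading eigenvector of the correlation matrix, a simpler stand-in chosen for illustration. All function names and the toy signal values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length signals (assumes neither is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_graph(signals):
    """Dense adjacency matrix: entry [i][j] is the correlation between bins i and j."""
    return [[pearson(a, b) for b in signals] for a in signals]

def leading_eigenvector(matrix, iterations=200):
    """Power iteration; a correlation matrix is positive semidefinite, so signs are stable."""
    v = [1.0] + [0.0] * (len(matrix) - 1)
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(len(v))) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def two_way_split(signals):
    """Assign each bin to group 0 or 1 by the sign of its leading-eigenvector entry."""
    v = leading_eigenvector(correlation_graph(signals))
    return [0 if x >= 0 else 1 for x in v]

# Toy data: one row per genome bin, one column per cell (made-up values).
signals = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
    [8.0, 6.0, 4.5, 2.0],
]
labels = two_way_split(signals)
```

The group labels 0 and 1 are arbitrary; deciding which group corresponds to the A compartment and which to B would need additional annotation that this sketch does not model.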


Subject(s)
Chromatin , Chromosomes , Chromatin/genetics , DNA , Molecular Conformation , Epigenesis, Genetic
19.
Neural Netw ; 172: 106151, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38301339

ABSTRACT

Representation learning on temporal interaction graphs (TIG) aims to model complex networks with dynamically evolving interactions across a wide range of web and social graph applications. However, most existing works on TIG either (a) update node embeddings discretely, only when an interaction occurs, and therefore fail to capture the continuous evolution of the nodes' embedding trajectories, or (b) overlook the rich temporal patterns hidden in the ever-changing graph data, which presumably leads to sub-optimal models. In this paper, we propose a two-module framework named ConTIG, a novel representation learning method on TIG that captures the continuous dynamic evolution of node embedding trajectories. With two essential modules, our model exploits three factors in dynamic networks: the latest interaction, neighbor features, and inherent characteristics. In the first module, the update module, we employ a continuous inference block to learn the nodes' state trajectories from time-adjacent interaction patterns using ordinary differential equations. In the second module, the transform module, we introduce a self-attention mechanism to predict future node embeddings by aggregating historical temporal interaction information. Experimental results demonstrate the superiority of ConTIG on temporal link prediction, temporal node recommendation, and dynamic node classification tasks on four datasets compared with a range of state-of-the-art baselines, especially for long-interval interaction prediction.
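
The idea of evolving a node state continuously between interactions with an ordinary differential equation can be illustrated with a minimal sketch. It uses plain forward-Euler integration, and `toy_dynamics` is a hypothetical stand-in for the learned derivative function; the abstract does not specify the solver or the form of the dynamics, so none of this should be read as ConTIG's implementation.

```python
import math

def euler_integrate(f, h0, t0, t1, steps):
    """Integrate dh/dt = f(t, h) from t0 to t1 with forward Euler and return h(t1)."""
    h = list(h0)
    dt = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        dh = f(t, h)
        # Euler update: h <- h + dt * dh/dt, applied per state dimension.
        h = [x + dt * d for x, d in zip(h, dh)]
        t += dt
    return h

def toy_dynamics(t, h):
    """Hypothetical derivative: exponential decay, standing in for a learned network."""
    return [-x for x in h]

# Evolve a 2-dimensional node state over one time unit between two interactions.
state = euler_integrate(toy_dynamics, [1.0, 2.0], 0.0, 1.0, 1000)
```

Because the state is defined at every time point rather than only at interaction events, it can be read off at any query time, which is the property the abstract contrasts with discretely updated embeddings.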


Subject(s)
Machine Learning
20.
Neural Netw ; 172: 106143, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38309139

ABSTRACT

Entity alignment aims to construct a complete knowledge graph (KG) by matching the same entities in multi-source KGs. Existing research on entity alignment mainly focuses on static multi-relational data in knowledge graphs. However, the relationships or attributes between entities often possess temporal characteristics as well. Neglecting these temporal characteristics can frequently lead to alignment errors. Compared with work on entity alignment in temporal knowledge graphs, relatively few efforts address entity alignment in cross-lingual temporal knowledge graphs. Therefore, in this paper, we put forward an entity alignment method for cross-lingual temporal knowledge graphs, namely CTEA. Based on GCN and TransE, CTEA combines entity embeddings, relation embeddings, and attribute embeddings to design a joint embedding model, which is more conducive to generating transferable entity embeddings. In addition, the distance calculation between elements and the similarity calculation of entity pairs are combined to enhance the reliability of cross-lingual entity alignment. Experiments show that the proposed CTEA model improves Hits@m and MRR by about 0.8 to 2.4 percentage points compared with the latest methods.
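
The Hits@m and MRR metrics reported in this abstract can be computed from the rank of each source entity's true counterpart among candidate target entities. The sketch below ranks candidates by L1 distance between embeddings, which is an assumption made for illustration; the abstract does not state which distance or similarity CTEA uses. The function names and toy embeddings are hypothetical.

```python
def l1_distance(a, b):
    """L1 (Manhattan) distance between two embedding vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def rank_of_true_match(source_vec, target_vecs, true_index):
    """1-based rank of the true counterpart when targets are sorted by distance."""
    true_dist = l1_distance(source_vec, target_vecs[true_index])
    closer = sum(1 for v in target_vecs if l1_distance(source_vec, v) < true_dist)
    return closer + 1

def hits_at_m(ranks, m):
    """Fraction of queries whose true counterpart is ranked within the top m."""
    return sum(1 for r in ranks if r <= m) / len(ranks)

def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# Toy embeddings: source entity i should align with target entity i.
sources = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
targets = [[0.1, 0.0], [1.0, 0.9], [0.5, 0.5]]
ranks = [rank_of_true_match(s, targets, i) for i, s in enumerate(sources)]

hits1 = hits_at_m(ranks, 1)
hits3 = hits_at_m(ranks, 3)
mrr = mean_reciprocal_rank(ranks)
```

In this toy case the first two source entities are matched correctly at rank 1 and the third is ranked last of three, so Hits@1 is 2/3 while Hits@3 is 1.0.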


Subject(s)
Knowledge , Pattern Recognition, Automated , Reproducibility of Results