Results 1 - 20 of 289
1.
BMC Med Res Methodol ; 24(1): 198, 2024 Sep 09.
Article in English | MEDLINE | ID: mdl-39251921

ABSTRACT

In settings requiring synthetic data generation based on a clinical cohort, e.g., due to data protection regulations, heterogeneity across individuals might be a nuisance that we need to control or faithfully preserve. The sources of such heterogeneity might be known, e.g., as indicated by sub-group labels, or might be unknown and thus reflected only in properties of distributions, such as bimodality or skewness. We investigate how such heterogeneity can be preserved and controlled when obtaining synthetic data from variational autoencoders (VAEs), i.e., a generative deep learning technique that utilizes a low-dimensional latent representation. To faithfully reproduce unknown heterogeneity reflected in marginal distributions, we propose to combine VAEs with pre-transformations. For dealing with known heterogeneity due to sub-groups, we complement VAEs with models for group membership, specifically from propensity score regression. The evaluation is performed with a realistic simulation design that features sub-groups and challenging marginal distributions. The proposed approach faithfully recovers the latter, compared to synthetic data approaches that focus purely on marginal distributions. Propensity scores add complementary information, e.g., when visualized in the latent space, and enable sampling of synthetic data with or without sub-group specific characteristics. We also illustrate the proposed approach with real data from an international stroke trial that exhibits considerable distribution differences between study sites, in addition to bimodality. These results indicate that describing heterogeneity by statistical approaches, such as propensity score regression, might be more generally useful for complementing generative deep learning for obtaining synthetic data that faithfully reflects structure from clinical cohorts.


Subject(s)
Propensity Score , Humans , Deep Learning , Algorithms , Computer Simulation
2.
Genome Biol ; 25(1): 229, 2024 Sep 06.
Article in English | MEDLINE | ID: mdl-39237934

ABSTRACT

Messenger RNA splicing and degradation are critical for gene expression regulation, the abnormality of which leads to diseases. Previous methods for estimating kinetic rates have limitations, assuming uniform rates across cells. DeepKINET is a deep generative model that estimates splicing and degradation rates at single-cell resolution from scRNA-seq data. DeepKINET outperforms existing methods on simulated and metabolic labeling datasets. Applied to forebrain and breast cancer data, it identifies RNA-binding proteins responsible for kinetic rate diversity. DeepKINET also analyzes the effects of splicing factor mutations on target genes in erythroid lineage cells. DeepKINET effectively reveals cellular heterogeneity in post-transcriptional regulation.


Subject(s)
RNA Splicing , Single-Cell Analysis , Humans , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , RNA Stability , Prosencephalon/metabolism , RNA-Binding Proteins/metabolism , RNA-Binding Proteins/genetics , Animals , Female
3.
ACS Appl Mater Interfaces ; 16(37): 49673-49686, 2024 Sep 18.
Article in English | MEDLINE | ID: mdl-39231373

ABSTRACT

In this paper, a multineural network fusion freestyle metasurface on-demand design method is proposed. The on-demand design method involves rapidly generating corresponding metasurface patterns based on the user-defined spectrum. The generated patterns are then input into a simulator to predict their corresponding S-parameter spectrogram, which is subsequently analyzed against the real S-parameter spectrogram to verify whether the generated metasurface patterns meet the desired requirements. The methodology is based on three neural network models: a Wasserstein Generative Adversarial Network model with a U-net architecture (U-WGAN) for inverse structural design, a Variational Autoencoder (VAE) model for compression, and an LSTM + Attention model for forward S-parameter spectrum prediction validation. The U-WGAN is utilized for on-demand reverse structural design, aiming to rapidly discover high-fidelity metasurface patterns that meet specific electromagnetic spectrum responses. The VAE, as a probabilistic generation model, serves as a bridge, mapping input data to latent space and transforming it into latent variable data, providing crucial input for a forward S-parameter spectrum prediction model. The LSTM + Attention network, acting as a forward S-parameter spectrum prediction model, can accurately and efficiently predict the S-parameter spectrum corresponding to the latent variable data and compare it with the real spectrum. In addition, the digits "0" and "1" are used in the design to represent vacuum and metallic materials, respectively, and a 10 × 10 cell array of freestyle metasurface patterns is constructed. The significance of the research method proposed in this paper lies in the following: (1) The freestyle metasurface design significantly expands the possibility of metamaterial design, enabling the creation of diverse metasurface structures that are difficult to achieve with traditional methods. 
(2) The on-demand design approach can generate high-fidelity metasurface patterns that meet the expected electromagnetic characteristics and responses. (3) The fusion of multiple neural networks demonstrates high flexibility, allowing for the adjustment of network structures and training methods based on specific design requirements and data characteristics, thus better accommodating different design problems and optimization objectives.

4.
ACS Nano ; 2024 Sep 17.
Article in English | MEDLINE | ID: mdl-39288200

ABSTRACT

DNA-stabilized silver nanoclusters (AgN-DNAs) have sequence-tuned compositions and fluorescence colors. High-throughput experiments together with supervised machine learning models have recently enabled design of DNA templates that select for AgN-DNA properties, including near-infrared (NIR) emission that holds promise for deep tissue bioimaging. However, these existing models do not enable simultaneous selection of multiple AgN-DNA properties, and require significant expert input for feature engineering and class definitions. This work presents a model for multiobjective, continuous-property design of AgN-DNAs with automatic feature extraction, based on variational autoencoders (VAEs). This model is generative, i.e., it learns both the forward mapping from DNA sequence to AgN-DNA properties and the inverse mapping from properties to sequence, and is trained on an experimental data set of DNA sequences paired with AgN-DNA fluorescence properties. Experimental testing shows that the model enables effective design of AgN-DNA emission, including bright NIR AgN-DNAs with 4-fold greater abundance compared to training data. In addition, Shapley analysis is employed to discern learned nucleobase patterns that correspond to fluorescence color and brightness. This generative model can be adapted for a range of biomolecular systems with sequence-dependent properties, enabling precise design of emerging biomolecular nanomaterials.

5.
Heliyon ; 10(15): e35407, 2024 Aug 15.
Article in English | MEDLINE | ID: mdl-39166054

ABSTRACT

In the context of burgeoning industrial advancement, there is an increasing trend towards the integration of intelligence and precision in mechanical equipment. Central to the functionality of such equipment is the rolling bearing, whose operational integrity significantly impacts the overall performance of the machinery. This underscores the imperative for reliable fault diagnosis mechanisms in the continuous monitoring of rolling bearing conditions within industrial production environments. Vibration signals are primarily used for fault diagnosis in mechanical equipment because they provide comprehensive information about the equipment's condition. However, fault data often contain high noise levels, high-frequency variations, and irregularities, along with a significant amount of redundant information, like duplication, overlap, and unnecessary information during signal transmission. These characteristics present considerable challenges for effective fault feature extraction and diagnosis, reducing the accuracy and reliability of traditional fault detection methods. This research introduces an innovative fault diagnosis methodology for rolling bearings using deep convolutional neural networks (CNNs) enhanced with variational autoencoders (VAEs). This deep learning approach aims to precisely identify and classify faults by extracting detailed vibration signal features. The VAE enhances noise robustness, while the CNN improves signal data expressiveness, addressing issues like gradient vanishing and explosion. The model employs the reparameterization trick for unsupervised learning of latent features and further trains with the CNN. The system incorporates adaptive threshold methods, the "3/5" strategy, and Dropout methods. The diagnosis accuracy of the VAE-CNN model for different fault types at different rotational speeds typically reaches more than 90 %, and it achieves a generally acceptable diagnosis result. 
Meanwhile, the VAE-CNN augmented fault diagnosis model, after experimental validation in various dimensions, can achieve more satisfactory diagnosis results for various fault types compared to several representative deep neural network models without VAE augmentation, significantly improving the accuracy and robustness of rolling bearing fault diagnosis.
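
The reparameterization trick mentioned in this abstract can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' VAE-CNN; the function names, the two-dimensional latent space, and the example values of `mu` and `log_var` are assumptions made for the sketch.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), element-wise.

    Sampling eps separately keeps z a deterministic function of
    (mu, log_var), which is what lets gradients flow through the encoder.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


# Hypothetical encoder outputs for one vibration-signal window.
rng = random.Random(0)
mu = [0.5, -1.0]
log_var = [0.0, -2.0]

z = reparameterize(mu, log_var, rng)      # one latent sample
kl = kl_to_standard_normal(mu, log_var)   # regularization term of the VAE loss
```

In a real model the same two operations would act on tensors inside an automatic-differentiation framework; plain lists are used here only to keep the sketch self-contained.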

6.
Bioinformatics ; 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39172488

ABSTRACT

MOTIVATION: Single-cell RNA sequencing (scRNA-seq) enables comprehensive characterization of the cell state. However, its destructive nature prohibits measuring gene expression changes during dynamic processes such as embryogenesis. Although recent studies integrating scRNA-seq with lineage tracing have provided clonal insights between progenitor and mature cells, challenges remain. Because of their experimental nature, observations are sparse, and cells observed in the early state are not the exact progenitors of cells observed at later time points. To overcome these limitations, we developed LineageVAE, a novel computational methodology that utilizes deep learning based on the property that cells sharing barcodes have identical progenitors. RESULTS: LineageVAE is a deep generative model that transforms scRNA-seq observations with identical lineage barcodes into sequential trajectories toward a common progenitor in a latent cell state space. This method enables the reconstruction of unobservable cell state transitions, historical transcriptomes, and regulatory dynamics at a single-cell resolution. Applied to hematopoiesis and reprogrammed fibroblast datasets, LineageVAE demonstrated its ability to restore backward cell state transitions and infer progenitor heterogeneity and transcription factor activity along differentiation trajectories. AVAILABILITY AND IMPLEMENTATION: The LineageVAE model was implemented in Python using the PyTorch deep learning library. The code is available on GitHub at https://github.com/LzrRacer/LineageVAE/. SUPPLEMENTARY INFORMATION: Available at Bioinformatics online.

7.
PeerJ Comput Sci ; 10: e2216, 2024.
Article in English | MEDLINE | ID: mdl-39145234

ABSTRACT

Piwi-interacting RNA (piRNA) is a type of non-coding small RNA that is highly expressed in mammalian testis. PiRNA has been implicated in various human diseases, but the experimental validation of piRNA-disease associations is costly and time-consuming. In this article, a novel computational method for predicting piRNA-disease associations using a multi-channel graph variational autoencoder (MC-GVAE) is proposed. This method integrates four types of similarity networks for piRNAs and diseases, which are derived from piRNA sequences, disease semantics, piRNA Gaussian Interaction Profile (GIP) kernel, and disease GIP kernel, respectively. These networks are modeled by a graph VAE framework, which can learn low-dimensional and informative feature representations for piRNAs and diseases. Then, a multi-channel method is used to fuse the feature representations from different networks. Finally, a three-layer neural network classifier is applied to predict the potential associations between piRNAs and diseases. The method was evaluated on a benchmark dataset containing 5,002 experimentally validated associations with 4,350 piRNAs and 21 diseases, constructed from the piRDisease v1.0 database. It achieved state-of-the-art performance, with an average AUC value of 0.9310 and an AUPR value of 0.9247 under five-fold cross-validation. This demonstrates the method's effectiveness and superiority in piRNA-disease association prediction.
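
The Gaussian Interaction Profile (GIP) kernel named in this abstract is commonly defined as K(i, j) = exp(-γ ‖IP(i) − IP(j)‖²), with γ set to a bandwidth constant divided by the mean squared norm of the interaction profiles. The sketch below follows that common definition; the abstract does not give the authors' exact settings, so the bandwidth constant `gamma_prime = 1.0` and the toy association matrix are assumptions.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """GIP kernel similarity between binary interaction profiles.

    profiles[i] is the association vector of entity i (for piRNAs, one
    row of the piRNA-disease matrix). gamma_prime is an assumed bandwidth.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in p) for p in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [[math.exp(-gamma * sq_dist(profiles[i], profiles[j]))
             for j in range(n)]
            for i in range(n)]


# Toy association matrix: 3 piRNAs (rows) x 3 diseases (columns).
assoc = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]
K = gip_kernel(assoc)
```

The disease GIP kernel would be obtained the same way from the columns of the association matrix, i.e. by passing its transpose.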

8.
G3 (Bethesda) ; 2024 Aug 16.
Article in English | MEDLINE | ID: mdl-39148415

ABSTRACT

The recent acceleration in genome sequencing targeting previously unexplored parts of the tree of life presents computational challenges. Samples collected from the wild often contain sequences from several organisms, including the target, its cobionts, and contaminants. Effective methods are therefore needed to separate sequences. Though advances in sequencing technology make this task easier, it remains difficult to taxonomically assign sequences from eukaryotic taxa that are not well-represented in databases. Therefore, reference-based methods alone are insufficient. Here, I examine how we can take advantage of differences in sequence composition between organisms to identify symbionts, parasites and contaminants in samples, with minimal reliance on reference data. To this end, I explore data from the Darwin Tree of Life project, including hundreds of high-quality HiFi read sets from insects. Visualising two-dimensional representations of read tetranucleotide composition learned by a Variational Autoencoder can reveal distinct components of a sample. Annotating the embeddings with additional information, such as coding density, estimated coverage, or taxonomic labels allows rapid assessment of the contents of a dataset. The approach scales to millions of sequences, making it possible to explore unassembled read sets, even for large genomes. Combined with interactive visualisation tools, it allows a large fraction of cobionts reported by reference-based screening to be identified. Crucially, it also facilitates retrieving genomes for which suitable reference data are absent.
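
The tetranucleotide composition of a read, which serves as the input representation described above, can be computed along these lines. This is a sketch under stated assumptions: the abstract does not say whether reverse-complement k-mers are merged or how ambiguous bases are handled, so here all 256 k-mers are kept separate and windows containing any non-ACGT character are skipped.

```python
from itertools import product

# All 256 tetranucleotides in a fixed order, and their positions.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def tetranucleotide_frequencies(read):
    """Return a 256-dimensional vector of 4-mer frequencies for one read."""
    counts = [0] * len(KMERS)
    seq = read.upper()
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])  # None if the window has N etc.
        if idx is not None:
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(KMERS)
    return [c / total for c in counts]


# "ACGTACGT" has 5 windows: ACGT, CGTA, GTAC, TACG, ACGT.
freqs = tetranucleotide_frequencies("ACGTACGT")
```

Vectors of this kind, one per read, are what a dimensionality-reduction model such as a VAE would then embed into two dimensions.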

9.
Neural Netw ; 180: 106600, 2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39208463

ABSTRACT

Few-shot learning often suffers from low generalization performance because the model is mostly learned from the base classes only. To mitigate this issue, a few-shot learning method with representative global prototypes is proposed in this paper. Specifically, to enhance generalization to novel classes, we propose a strategy for jointly training base and novel classes. This process produces prototypes that characterize the class information, called representative global prototypes. Additionally, to avoid the data imbalance and prototype bias caused by newly added categories with sparse samples, a novel sample synthesis method is proposed for augmenting the novel classes with more representative samples. Finally, representative samples and non-representative samples with high uncertainty are selected to enhance the representational and discriminative abilities of the global prototype. Extensive experiments have been conducted on two popular benchmark datasets, and the experimental results show that this method significantly improves the classification ability of few-shot learning tasks and achieves state-of-the-art performance.

10.
Math Biosci Eng ; 21(7): 6608-6630, 2024 Jul 16.
Article in English | MEDLINE | ID: mdl-39176411

ABSTRACT

Feature representations with rich topic information can greatly improve the performance of story segmentation tasks. VAEGAN offers distinct advantages in feature learning by combining variational autoencoder (VAE) and generative adversarial network (GAN), which not only captures intricate data representations through VAE's probabilistic encoding and decoding mechanism but also enhances feature diversity and quality via GAN's adversarial training. To better learn topical domain representation, we used a topical classifier to supervise the training process of VAEGAN. Based on the learned feature, a segmentor splits the document into shorter ones with different topics. The hidden Markov model (HMM) is a popular approach for story segmentation, in which stories are viewed as instances of topics (hidden states). The number of states has to be set manually but it is often unknown in real scenarios. To solve this problem, we proposed an infinite HMM (IHMM) approach which utilized an HDP prior on transition matrices over countably infinite state spaces to automatically infer the number of states from the data. Given a running text, a blocked Gibbs sampler labeled the states with topic classes. The position where the topic changes was a story boundary. Experimental results on the TDT2 corpus demonstrated that the proposed topical VAEGAN-IHMM approach was significantly better than the traditional HMM method in story segmentation tasks and achieved state-of-the-art performance.

11.
Neuroimage ; 299: 120806, 2024 Oct 01.
Article in English | MEDLINE | ID: mdl-39179011

ABSTRACT

Recent studies indicate that differences in cognition among individuals may be partially attributed to unique brain wiring patterns. While functional connectivity (FC)-based fingerprinting has demonstrated high accuracy in identifying adults, early studies on neonates suggest that individualized FC signatures are absent. We posit that individual uniqueness is present in neonatal FC data and that conventional linear models fail to capture the rapid developmental trajectories characteristic of newborn brains. To explore this hypothesis, we employed a deep generative model, known as a variational autoencoder (VAE), leveraging two extensive public datasets: one comprising resting-state functional MRI (rs-fMRI) scans from 100 adults and the other from 464 neonates. VAE models trained on rs-fMRI from both adults and newborns produced superior age prediction performance (with r between predicted and actual age ∼ 0.7) and individual identification accuracy (∼45 %) compared to models trained solely on adult or neonatal data. The VAE model also showed significantly higher individual identification accuracy than linear models (=10∼30 %). Importantly, the VAE differentiated connections reflecting age-related changes from those indicative of individual uniqueness, a distinction not possible with linear models. Moreover, we derived 20 latent variables, each corresponding to distinct patterns of cortical functional networks (CFNs). These CFNs varied in their representation of brain maturation and individual signatures; notably, certain CFNs that failed to capture neurodevelopmental traits, in fact, exhibited individual signatures. CFNs associated with neonatal neurodevelopment predominantly encompassed unimodal regions such as visual and sensorimotor areas, whereas those linked to individual uniqueness spanned multimodal and transmodal brain regions. 
The VAE's capacity to extract features from rs-fMRI data beyond the capabilities of linear models positions it as a valuable tool for delineating cognitive traits inherent in rs-fMRI and exploring individualized imaging phenotypes.


Subject(s)
Brain , Connectome , Deep Learning , Magnetic Resonance Imaging , Humans , Infant, Newborn , Connectome/methods , Magnetic Resonance Imaging/methods , Male , Female , Adult , Brain/diagnostic imaging , Brain/physiology , Brain/growth & development , Young Adult , Nerve Net/diagnostic imaging , Nerve Net/physiology
12.
Heart Rhythm ; 2024 Aug 20.
Article in English | MEDLINE | ID: mdl-39168295

ABSTRACT

BACKGROUND: Arrhythmogenic right ventricular cardiomyopathy (ARVC) is a rare genetic heart disease associated with life-threatening ventricular arrhythmias. Diagnosis of ARVC is based on the 2010 Task Force Criteria (TFC), application of which often requires clinical expertise at specialized centers. OBJECTIVE: The purpose of this study was to develop and validate an electrocardiogram (ECG) deep learning (DL) tool for ARVC diagnosis. METHODS: ECGs of patients referred for ARVC evaluation were used to develop (n = 551 [80.1%]) and test (n = 137 [19.9%]) an ECG-DL model for prediction of TFC-defined ARVC diagnosis. The ARVC ECG-DL model was externally validated in a cohort of patients with pathogenic or likely pathogenic (P/LP) ARVC gene variants identified through the Geisinger MyCode Community Health Initiative (N = 167). RESULTS: Of 688 patients evaluated at Johns Hopkins Hospital (JHH) (57.3% male, mean age 40.2 years), 329 (47.8%) were diagnosed with ARVC. Although ARVC diagnosis made by referring cardiologist ECG interpretation was unreliable (c-statistic 0.53; confidence interval [CI] 0.52-0.53), ECG-DL discrimination in the hold-out testing cohort was excellent (0.87; 0.86-0.89) and compared favorably to that of ECG interpretation by an ARVC expert (0.85; 0.84-0.86). In the Geisinger cohort, prevalence of ARVC was lower (n = 17 [10.2%]), but ECG-DL-based identification of ARVC phenotype remained reliable (0.80; 0.77-0.83). Discrimination was further increased when ECG-DL predictions were combined with non-ECG-derived TFC in the JHH testing (c-statistic 0.940; 95% CI 0.933-0.948) and Geisinger validation (0.897; 95% CI 0.883-0.912) cohorts. CONCLUSION: ECG-DL augments diagnosis of ARVC to the level of an ARVC expert and can differentiate true ARVC diagnosis from phenotype-mimics and at-risk family members/genotype-positive individuals.
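
The c-statistic reported throughout this abstract is the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. A minimal pairwise sketch is shown below; the scores and labels are invented for illustration and are not taken from the study.

```python
def c_statistic(scores, labels):
    """Concordance statistic (area under the ROC curve) by pair counting."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0   # positive ranked above negative
            elif p == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(pos) * len(neg))


# Hypothetical model scores for four patients (1 = diagnosed, 0 = not).
auc = c_statistic([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])  # 3 of 4 pairs concordant
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is the scale on which the 0.53, 0.85, and 0.87 figures above should be read.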

13.
Sensors (Basel) ; 24(16), 2024 Aug 16.
Article in English | MEDLINE | ID: mdl-39205010

ABSTRACT

With the rapid development of industry, the risks factories face are increasing. Therefore, the anomaly detection algorithms deployed in factories need to have high accuracy, and they need to be able to promptly discover and locate the specific equipment causing the anomaly to restore the regular operation of the abnormal equipment. However, the neural network models currently deployed in factories cannot effectively capture both temporal features within dimensions and relationship features between dimensions; some algorithms that consider both types of features lack interpretability. Therefore, we propose a high-precision, interpretable anomaly detection algorithm based on variational autoencoder (VAE). We use a multi-scale local weight-sharing convolutional neural network structure to fully extract the temporal features within each dimension of the multi-dimensional time series. Then, we model the features from various aspects through multiple attention heads, extracting the relationship features between dimensions. We map the attention output results to the latent space distribution of the VAE and propose an optimization method to improve the reconstruction performance of the VAE, detecting anomalies through reconstruction errors. Regarding anomaly interpretability, we utilize the VAE probability distribution characteristics, decompose the obtained joint probability density into conditional probabilities on each dimension, and calculate the anomaly score, which provides helpful value for technicians. Experimental results show that our algorithm performed best in terms of F1 score and AUC value. The AUC value for anomaly detection is 0.982, and the F1 score is 0.905, which is 4% higher than the best-performing baseline algorithm, Transformer with a Discriminator for Anomaly Detection (TDAD). It also provides accurate anomaly interpretation capability.
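
Detection through reconstruction errors and scoring by F1, as described in this abstract, can be sketched as follows. The fixed threshold, the error values, and the labels are hypothetical; the abstract does not state how the authors choose their threshold.

```python
def f1_from_errors(errors, labels, threshold):
    """Flag points whose reconstruction error exceeds threshold, then score F1."""
    preds = [1 if e > threshold else 0 for e in errors]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-timestep reconstruction errors and ground-truth anomalies.
errors = [0.02, 0.03, 0.50, 0.04, 0.45, 0.20]
labels = [0, 0, 1, 0, 1, 0]
f1 = f1_from_errors(errors, labels, threshold=0.1)
```

With this toy data both anomalies are caught and one normal point is flagged, so precision is 2/3 and recall is 1.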

14.
J Bioinform Comput Biol ; 22(4): 2450018, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39215523

ABSTRACT

Circular RNAs (circRNAs) are endogenous non-coding RNAs with a covalently closed loop structure. They have many biological functions, mainly regulatory ones. They have been proven to modulate protein-coding genes in the human genome. CircRNAs are linked to various diseases like Alzheimer's disease, diabetes, atherosclerosis, Parkinson's disease and cancer. Identifying the associations between circular RNAs and diseases is essential for disease diagnosis, prevention, and treatment. The proposed model, based on the variational autoencoder and genetic algorithm circular RNA disease association (VAGA-CDA), predicts novel circRNA-disease associations. First, the experimentally verified circRNA-disease associations are augmented with the synthetic minority oversampling technique (SMOTE) and regenerated using a variational autoencoder, and feature selection is applied to these vectors by a genetic algorithm (GA). The variational autoencoder effectively extracts features from the augmented samples. The optimized feature selection of the genetic algorithm effectively carried out dimensionality reduction. The sophisticated feature vectors extracted are then given to a Random Forest classifier to predict new circRNA-disease associations. The proposed model yields an AUC value of 0.9644 and 0.9628 under 5-fold and 10-fold cross-validations, respectively. The results of the case studies indicate the robustness of the proposed model.


Subject(s)
Algorithms , Computational Biology , RNA, Circular , RNA, Circular/genetics , Humans , Computational Biology/methods , Genetic Predisposition to Disease/genetics , Alzheimer Disease/genetics , RNA/genetics
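
The SMOTE step named in the abstract of this entry creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority-class neighbours. The sketch below is a simplified, standard-library version of that idea, not the VAGA-CDA implementation; the function name, the neighbour count `k`, and the two-dimensional toy vectors are assumptions.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic samples from a list of minority-class vectors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (excluding itself).
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: sq_dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (q - b) for b, q in zip(base, neighbour)])
    return synthetic


# Toy minority-class feature vectors (hypothetical).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority, n_new=5)
```

Each synthetic point lies on a line segment between two real minority samples, so it stays inside the region the minority class already occupies.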
15.
Interact J Med Res ; 13: e53672, 2024 Aug 12.
Article in English | MEDLINE | ID: mdl-39133916

ABSTRACT

BACKGROUND: Mental disorders have ranked among the top 10 prevalent causes of burden on a global scale. Generative artificial intelligence (GAI) has emerged as a promising and innovative technological advancement that has significant potential in the field of mental health care. Nevertheless, there is a scarcity of research dedicated to examining and understanding the application landscape of GAI within this domain. OBJECTIVE: This review aims to inform the current state of GAI knowledge and identify its key uses in the mental health domain by consolidating relevant literature. METHODS: Records were searched within 8 reputable sources including Web of Science, PubMed, IEEE Xplore, medRxiv, bioRxiv, Google Scholar, CNKI and Wanfang databases between 2013 and 2023. Our focus was on original, empirical research with either English or Chinese publications that use GAI technologies to benefit mental health. For an exhaustive search, we also checked the studies cited by relevant literature. Two reviewers were responsible for the data selection process, and all the extracted data were synthesized and summarized for brief and in-depth analyses depending on the GAI approaches used (traditional retrieval and rule-based techniques vs advanced GAI techniques). RESULTS: In this review of 144 articles, 44 (30.6%) met the inclusion criteria for detailed analysis. Six key uses of advanced GAI emerged: mental disorder detection, counseling support, therapeutic application, clinical training, clinical decision-making support, and goal-driven optimization. Advanced GAI systems have been mainly focused on therapeutic applications (n=19, 43%) and counseling support (n=13, 30%), with clinical training being the least common. Most studies (n=28, 64%) focused broadly on mental health, while specific conditions such as anxiety (n=1, 2%), bipolar disorder (n=2, 5%), eating disorders (n=1, 2%), posttraumatic stress disorder (n=2, 5%), and schizophrenia (n=1, 2%) received limited attention. 
Despite prevalent use, the efficacy of ChatGPT in the detection of mental disorders remains insufficient. In addition, 100 articles on traditional GAI approaches were found, indicating diverse areas where advanced GAI could enhance mental health care. CONCLUSIONS: This study provides a comprehensive overview of the use of GAI in mental health care, which serves as a valuable guide for future research, practical applications, and policy development in this domain. While GAI demonstrates promise in augmenting mental health care services, its inherent limitations emphasize its role as a supplementary tool rather than a replacement for trained mental health providers. A conscientious and ethical integration of GAI techniques is necessary, ensuring a balanced approach that maximizes benefits while mitigating potential challenges in mental health care practices.

16.
Sci Rep ; 14(1): 18451, 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-39117712

ABSTRACT

As a class of biologically active molecules with significant immunomodulatory and anti-inflammatory effects, anti-inflammatory peptides have important application value in the medical and biotechnology fields due to their unique biological functions. Research on the identification of anti-inflammatory peptides provides important theoretical foundations and practical value for a deeper understanding of the biological mechanisms of inflammation and immune regulation, as well as for the development of new drugs and biotechnological applications. Therefore, it is necessary to develop more advanced computational models for identifying anti-inflammatory peptides. In this study, we propose a deep learning model named DAC-AIPs based on variational autoencoder and contrastive learning for accurate identification of anti-inflammatory peptides. In the sequence encoding part, the incorporation of multi-hot encoding helps capture richer sequence information. The autoencoder, composed of convolutional layers and linear layers, can learn latent features and reconstruct features, with variational inference enhancing the representation capability of latent features. Additionally, the introduction of contrastive learning aims to improve the model's classification ability. Through cross-validation and independent dataset testing experiments, DAC-AIPs achieves superior performance compared to existing state-of-the-art models. In cross-validation, the classification accuracy of DAC-AIPs reached around 88%, which is 7% higher than previous models. Furthermore, various ablation experiments and interpretability experiments validate the effectiveness of DAC-AIPs. Finally, a user-friendly online predictor is designed to enhance the practicality of the model, and the server is freely accessible at http://dac-aips.online .


Subject(s)
Anti-Inflammatory Agents , Deep Learning , Peptides , Peptides/chemistry , Humans
17.
Diagnostics (Basel) ; 14(14), 2024 Jul 19.
Article in English | MEDLINE | ID: mdl-39061703

ABSTRACT

Pneumonia ranks among the most prevalent lung diseases and poses a significant concern since it is one of the diseases that may lead to death around the world. Diagnosing pneumonia necessitates a chest X-ray and substantial expertise to ensure accurate assessments. Despite the critical role of lateral X-rays in providing additional diagnostic information alongside frontal X-rays, they have not been widely used. Obtaining X-rays from multiple perspectives is crucial, significantly improving the precision of disease diagnosis. In this paper, we propose a multi-view multi-feature fusion model (MV-MFF) that integrates latent representations from a variational autoencoder and a β-variational autoencoder. Our model aims to classify pneumonia presence using multi-view X-rays. Experimental results demonstrate that the MV-MFF model achieves an accuracy of 80.4% and an area under the curve of 0.775, outperforming current state-of-the-art methods. These findings underscore the efficacy of our approach in improving pneumonia diagnosis through multi-view X-ray analysis.
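
For orientation, the two autoencoder variants combined in this abstract differ only in how strongly the latent distribution is pulled toward the prior. The standard objective, written here in the usual notation rather than the authors' own (the symbols are assumptions of this note), is:

```latex
\mathcal{L}_{\beta}(x)
  = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  - \beta \, D_{\mathrm{KL}}\big(q_\phi(z \mid x) \,\|\, p(z)\big)
```

Setting β = 1 recovers the ordinary VAE evidence lower bound, while β > 1 weights the KL term more heavily. The abstract does not report which value of β the MV-MFF model uses.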

18.
Comput Biol Med ; 179: 108813, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38955127

ABSTRACT

BACKGROUND: Missing data is a common challenge in mass spectrometry-based metabolomics and can lead to biased and incomplete analyses. The integration of whole-genome sequencing (WGS) data with metabolomics data has emerged as a promising approach to enhance the accuracy of data imputation in metabolomics studies. METHOD: In this study, we propose a novel method that leverages information from WGS data and reference metabolites to impute unknown metabolites. Our approach uses a multi-scale variational autoencoder to jointly model the burden score, polygenic risk score (PGS), and linkage disequilibrium (LD)-pruned single nucleotide polymorphisms (SNPs) for feature extraction and imputation of missing metabolomics data. By learning latent representations of both omics data types, our method can effectively impute missing metabolomics values based on genomic information. RESULTS: We evaluate the performance of our method on empirical metabolomics datasets with missing values and demonstrate its superiority over conventional imputation techniques. Using burden scores, PGS, and LD-pruned SNPs derived from 35 template metabolites, the proposed method achieved R² scores > 0.01 for 71.55% of metabolites. CONCLUSION: The integration of WGS data in metabolomics imputation not only improves data completeness but also enhances downstream analyses, paving the way for more comprehensive and accurate investigations of metabolic pathways and disease associations. Our findings offer valuable insights into the potential benefits of using WGS data for metabolomics data imputation and underscore the importance of multi-modal data integration in precision medicine research.
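
To make the reported summary statistic concrete, the following is a minimal sketch of how a per-metabolite R² and the share of metabolites above a threshold could be computed. It assumes the coefficient-of-determination definition (1 minus residual over total sum of squares); the abstract does not say which R² definition the authors used, and the function names and toy values are hypothetical.

```python
def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 is undefined")
    return 1.0 - ss_res / ss_tot


def share_above(r2_by_metabolite, threshold):
    """Fraction of metabolites whose R2 is strictly above the threshold."""
    hits = sum(1 for r2 in r2_by_metabolite.values() if r2 > threshold)
    return hits / len(r2_by_metabolite)


# Toy, made-up values purely for illustration (not data from the study).
toy_r2 = {"met_a": 0.5, "met_b": 0.0, "met_c": -0.2, "met_d": 0.02}
toy_share = share_above(toy_r2, threshold=0.01)
```

Under this definition R² can be negative when the imputed values fit worse than the metabolite mean, which is why a low threshold such as 0.01 still separates informative from uninformative imputations.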


Subject(s)
Metabolomics , Single Nucleotide Polymorphism , Whole Genome Sequencing , Humans , Metabolomics/methods , Linkage Disequilibrium
19.
Sci Rep ; 14(1): 17444, 2024 Jul 29.
Article in English | MEDLINE | ID: mdl-39075127

ABSTRACT

The clock drawing test (CDT) is a neuropsychological assessment tool used to screen an individual's cognitive ability. In this study, we developed a Fair and Interpretable Representation of Clock drawing test (FaIRClocks) to evaluate and mitigate classification bias against people with less than 8 years of education while screening their cognitive function using an array of neuropsychological measures. We represented clock drawings by a previously published 10-dimensional deep learning feature set trained on publicly available data from the National Health and Aging Trends Study (NHATS). These embeddings were further fine-tuned with clocks from a preoperative cognitive screening program at the University of Florida to predict three cognitive scores: the Mini-Mental State Examination (MMSE) total score, an attention composite z-score (ATT-C), and a memory composite z-score (MEM-C). ATT-C and MEM-C scores were developed by averaging z-scores based on normative references. The cognitive screening classifiers were initially tested to compare their performance in patients with fewer years of education (≤ 8 years) versus patients with more education (> 8 years), and by race. Results indicated that the initial unweighted classifiers confounded lower education with cognitive compromise, resulting in a 100% type I error rate for this group. Therefore, the samples were re-weighted using multiple fairness metrics to achieve sensitivity/specificity and positive/negative predictive value (PPV/NPV) balance across groups. In summary, we report the FaIRClocks model, which shows promise for helping to identify and mitigate bias against people with less than 8 years of education during preoperative cognitive screening.
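
The group-wise quantities named in this abstract (sensitivity, specificity, PPV, NPV, type I error rate) can be sketched as follows. This is a minimal illustration of the standard definitions computed per subgroup, not the authors' re-weighting procedure, which the abstract does not describe; the function names, group labels, and toy values are hypothetical.

```python
def screening_metrics(y_true, y_pred):
    """Standard binary screening metrics; 1 = impaired, 0 = not impaired."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num, den):
        # None marks a metric that is undefined for this group.
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "type_i_error": ratio(fp, fp + tn),  # false positive rate
    }


def metrics_by_group(y_true, y_pred, groups):
    """Compute the same metrics separately for each subgroup label."""
    result = {}
    for label in sorted(set(groups)):
        idx = [i for i, g in enumerate(groups) if g == label]
        result[label] = screening_metrics(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )
    return result


# Toy, made-up labels: every unimpaired person in the "<=8y" group is flagged.
toy = metrics_by_group(
    y_true=[0, 0, 0, 1, 0, 0, 1, 1],
    y_pred=[1, 1, 1, 1, 0, 0, 1, 0],
    groups=["<=8y", "<=8y", "<=8y", "<=8y", ">8y", ">8y", ">8y", ">8y"],
)
```

In the toy data the "<=8y" group has a type I error rate of 1.0, mirroring the kind of disparity the abstract reports for the unweighted classifiers.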


Subject(s)
Educational Status , Racism , Humans , Male , Female , Aged , Neuropsychological Tests , Cognition/physiology , Cognitive Dysfunction/diagnosis , Aged 80 and over , Mental Status and Dementia Tests , Middle Aged , Deep Learning
20.
Front Psychiatry ; 15: 1397093, 2024.
Article in English | MEDLINE | ID: mdl-38832332

ABSTRACT

Background: Resting-state functional magnetic resonance imaging (rs-fMRI) has been used extensively to study brain function in psychiatric disorders, yielding insights into brain organization. However, the high dimensionality of rs-fMRI data presents significant challenges for data analysis. Variational autoencoders (VAEs), a type of neural network, have been instrumental in extracting low-dimensional latent representations of resting-state functional connectivity (rsFC) patterns, thereby addressing the complex nonlinear structure of rs-fMRI data. Despite these advances, interpreting these latent representations remains a challenge. This paper aims to address this gap by developing explainable VAE models and testing their utility using rs-fMRI data in autism spectrum disorder (ASD). Methods: One thousand one hundred and fifty participants (601 healthy controls [HC] and 549 patients with ASD) were included in the analysis. RsFC correlation matrices were extracted from the preprocessed rs-fMRI data using the Power atlas, which includes 264 regions of interest (ROIs). VAEs were then trained in an unsupervised manner. Lastly, we introduce our latent contribution scores to explain the relationship between estimated representations and the original rs-fMRI brain measures. Results: We quantified the latent contribution scores for both the ASD and HC groups at the network level. We found that the ASD and HC groups share the top network connectivities contributing to all estimated latent components. For example, latent 0 was driven by rsFC within the ventral attention network (VAN) in both the ASD and HC groups. However, we found significant differences in the latent contribution scores between the ASD and HC groups within the VAN for latent 0 and the sensory/somatomotor network for latent 2. Conclusion: This study introduced latent contribution scores to interpret nonlinear patterns identified by VAEs. These scores effectively capture changes in each observed rsFC feature as the estimated latent representation changes, enabling an explainable deep learning model that supports a better understanding of the underlying neural mechanisms of ASD.
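
As a rough illustration of the dimensionality involved, the sketch below builds pairwise Pearson correlations between ROI time series and counts the unique ROI pairs for a 264-region atlas. It assumes that each unique ROI pair is one feature (the upper triangle of the correlation matrix); the abstract does not state how the matrices were passed to the VAE, and the function names and toy series are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length time series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rsfc_edges(roi_series):
    """Correlation for every unique ROI pair (i < j), keyed by (i, j)."""
    n = len(roi_series)
    return {
        (i, j): pearson(roi_series[i], roi_series[j])
        for i in range(n)
        for j in range(i + 1, n)
    }


def n_unique_edges(n_rois):
    """Number of unique ROI pairs in a symmetric connectivity matrix."""
    return n_rois * (n_rois - 1) // 2


# Toy time series for three hypothetical ROIs (not real rs-fMRI data).
toy_edges = rsfc_edges([[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]])
power_atlas_edges = n_unique_edges(264)
```

With 264 ROIs this yields 34,716 unique connectivity values per participant, which gives a sense of why a low-dimensional latent representation is attractive.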
