Results 1 - 20 of 84
1.
Methods Mol Biol ; 2812: 1-9, 2024.
Article in English | MEDLINE | ID: mdl-39068354

ABSTRACT

In this chapter, we present an established pipeline for analyzing RNA-Seq data, which involves a step-by-step flow starting from raw data obtained from a sequencer and culminating in the identification of differentially expressed genes with their functional characterization. The pipeline is divided into three sections, each addressing crucial stages of the analysis process. The first section covers the initial steps of the pipeline, including downloading of the data of interest and performing quality control assessment. Assessment ensures that the data used for analysis is reliable and suitable for downstream analyses. In the second section, gene-level quantification is performed, which entails quantification of expression levels of genes in the samples. The third and final section is focused on differential expression analysis, which involves comparing gene expression levels between two or more conditions. This step helps identify genes that show significant differences in expression levels under different experimental conditions. To facilitate accessibility and reproducibility, we have provided an online repository containing all scripts and files. Additionally, custom scripts are available, enabling users to modify the pipeline's output for various downstream analyses. By following this pipeline, researchers can effectively analyze RNA-Seq data and gain valuable insights into gene expression patterns and, furthermore, the understanding of biological processes.


Subject(s)
Gene Expression Profiling; RNA-Seq; Software; RNA-Seq/methods; Gene Expression Profiling/methods; Computational Biology/methods; Humans; Sequence Analysis, RNA/methods; High-Throughput Nucleotide Sequencing/methods; Reproducibility of Results; Quality Control; Data Analysis; Transcriptome/genetics
2.
Methods Mol Biol ; 2812: 115-141, 2024.
Article in English | MEDLINE | ID: mdl-39068359

ABSTRACT

RNA sequencing is an approach to transcriptomic profiling that enables the detection of differentially expressed genes in response to genetic mutation or experimental treatment, among other uses. Here we describe a method for the use of a customizable, user-friendly bioinformatic pipeline to identify differentially expressed genes in RNA sequencing data obtained from C. elegans, with attention to the improvement in reproducibility and accuracy of results.


Subject(s)
Caenorhabditis elegans; Computational Biology; Gene Expression Profiling; Sequence Analysis, RNA; Software; Workflow; Caenorhabditis elegans/genetics; Animals; Computational Biology/methods; Sequence Analysis, RNA/methods; Gene Expression Profiling/methods; Transcriptome; High-Throughput Nucleotide Sequencing/methods; Reproducibility of Results
3.
Methods Mol Biol ; 2812: 367-377, 2024.
Article in English | MEDLINE | ID: mdl-39068373

ABSTRACT

A protein, which can attain a prion state, differs from standard proteins in terms of structural conformation and aggregation propensity. High-throughput sequencing technology provides an opportunity to gain insight into the prion disease condition when coupled with single-cell RNA-Seq analysis to reveal transcriptional changes during prion-based pathogenicity. In this chapter, we present a protocol for RNA-Seq analysis of mammalian prion disease using a single-cell RNA sequencing dataset procured from the NCBI GEO database. This protocol is a tool that can assist researchers in characterizing mammalian prion disease in a reproducible and reusable manner. Further, the resulting output has the potential to provide transcript biomarkers for mammalian prion diseases, which can be employed for diagnostic and prognostic purposes.


Subject(s)
Prion Diseases; Animals; Prion Diseases/genetics; Humans; RNA-Seq/methods; High-Throughput Nucleotide Sequencing/methods; Mammals/genetics; Single-Cell Analysis/methods; Prions/genetics; Prions/metabolism; Sequence Analysis, RNA/methods
4.
Microorganisms ; 12(6), 2024 May 24.
Article in English | MEDLINE | ID: mdl-38930440

ABSTRACT

COVID-19, caused by SARS-CoV-2, results in respiratory and cardiopulmonary infections. There is an urgent need to understand not just the pathogenic mechanisms of this disease but also its impact on the physiology of different organs and microbiomes. Multiple studies have reported the effects of COVID-19 on the gastrointestinal microbiota, such as promoting dysbiosis (imbalances in the microbiome) following the disease's progression. Deconstructing the dynamic changes in microbiome composition that are specifically correlated with COVID-19 patients remains a challenge. Motivated by this problem, we implemented a biomarker discovery pipeline to identify candidate microbes specific to COVID-19. This involved a meta-analysis of large-scale COVID-19 metagenomic data to decipher the impact of COVID-19 on the human gut and respiratory microbiomes. Metagenomic studies of the gut and respiratory microbiomes of COVID-19 patients and of microbiomes from other respiratory diseases with symptoms similar to or overlapping with COVID-19 revealed 1169 and 131 differentially abundant microbes in the human gut and respiratory microbiomes, respectively, that uniquely associate with COVID-19. Furthermore, by utilizing machine learning models (LASSO and XGBoost), we demonstrated the power of microbial features in separating COVID-19 samples from metagenomic samples representing other respiratory diseases and controls (healthy individuals), achieving an overall accuracy of over 80%. Overall, our study provides insights into the microbiome shifts occurring in COVID-19 patients, shining a new light on the compositional changes.

5.
J Biotechnol ; 388: 49-58, 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38641137

ABSTRACT

Mobilization of clusters of genes called genomic islands (GIs) across bacterial lineages facilitates dissemination of traits such as resistance against antibiotics, virulence or hypervirulence, and versatile metabolic capabilities. Robust delineation of GIs is critical to understanding bacterial evolution, which has a vast impact on different life forms. Methods for identification of GIs exploit different evolutionary features or signals encoded within the genomes of bacteria; however, the current state of the art in GI detection still leaves much to be desired. Here, we have taken a combinatorial approach that accounts for GI-specific features such as compositional bias, aberrant phyletic pattern, and marker gene enrichment within an integrative framework to delineate GIs in bacterial genomes. Our GI prediction tool, DICEP, was assessed on simulated genomes and well-characterized bacterial genomes. DICEP compared favorably with current GI detection tools on real and synthetic datasets.


Subject(s)
Genome, Bacterial; Genomic Islands; Genomic Islands/genetics; Genome, Bacterial/genetics; Bacteria/genetics; Genomics/methods; Phylogeny; Software; Computational Biology/methods
6.
Plants (Basel) ; 13(5), 2024 Feb 21.
Article in English | MEDLINE | ID: mdl-38475429

ABSTRACT

The ultimate goal of selecting RNA-Seq alignment software is to perform accurate alignments with a robust algorithm that is capable of detecting the various intricacies underlying read-mapping procedures and beyond. Most alignment software tools are typically pre-tuned with human or prokaryotic data, and therefore may not be suitable for applications to other organisms, such as plants. The rapidly growing plant RNA-Seq databases call for the assessment of the alignment tools on curated plant data, which will aid the calibration of these tools for applications to plant transcriptomic data. We therefore focused here on benchmarking RNA-Seq read alignment tools, using simulated data derived from the model organism Arabidopsis thaliana. We assessed the performance of five popular RNA-Seq alignment tools that are currently available, based on their usage (citation count). By introducing annotated single nucleotide polymorphisms (SNPs) from The Arabidopsis Information Resource (TAIR), we recorded alignment accuracy at both base-level and junction base-level resolutions for each alignment tool. In addition to assessing the performance of the alignment tools at their default settings, accuracies were also recorded by varying the values of numerous parameters, including the confidence threshold and the level of SNP introduction. The performances of the aligners were found to be consistent under various testing conditions at the base-level accuracy; however, the junction base-level assessment produced varying results depending upon the applied algorithm. At the read base-level assessment, the overall performance of the aligner STAR was superior to other aligners, with the overall accuracy reaching over 90% under different test conditions. On the other hand, at the junction base-level assessment, SubRead emerged as the most promising aligner, with an overall accuracy over 80% under most test conditions.
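
The abstract reports base-level accuracy without defining it, so the following is only an illustrative sketch under an assumed definition: the fraction of simulated read bases that an aligner places at their true reference coordinate. The function name, the dictionary layout, and the use of `None` for unaligned bases are hypothetical choices, not the authors' benchmarking code.

```python
def base_level_accuracy(true_positions, aligned_positions):
    """Fraction of read bases placed at their true reference coordinate.

    Both arguments map a read ID to a list with one reference coordinate
    per read base (assumed layout); None marks an unaligned base.
    Reads missing from aligned_positions count as entirely wrong.
    """
    correct = 0
    total = 0
    for read_id, truth in true_positions.items():
        predicted = aligned_positions.get(read_id, [])
        for i, true_coord in enumerate(truth):
            # Bases beyond the reported alignment are treated as unaligned.
            pred_coord = predicted[i] if i < len(predicted) else None
            total += 1
            if pred_coord is not None and pred_coord == true_coord:
                correct += 1
    return correct / total if total else 0.0
```

Under the same assumption, a junction base-level score could be obtained by restricting both inputs to reads that span an annotated splice junction before calling the function.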

7.
BMC Bioinformatics ; 25(1): 107, 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38468193

ABSTRACT

As noncommunicable diseases (NCDs) pose a significant global health burden, identifying effective diagnostic and predictive markers for these diseases is of paramount importance. Epigenetic modifications, such as DNA methylation, have emerged as potential indicators for NCDs. These have previously been exploited in other contexts within the framework of neural network models that capture complex relationships within the data. Applications of neural networks have led to significant breakthroughs in various biological or biomedical fields, but these have not yet been effectively applied to NCD modeling. This is, in part, due to limited datasets that are not amenable to the building of robust neural network models. In this work, we leveraged a neural network trained on one class of NCDs, cancer, as the basis for a transfer learning approach to non-cancer NCD modeling. Our results demonstrate promising performance of the model in predicting three NCDs, namely, arthritis, asthma, and schizophrenia, for the respective blood samples, with an overall accuracy (F-measure) of 94.5%. Furthermore, a concept-based explanation method called Testing with Concept Activation Vectors (TCAV) was used to investigate the importance of the sample sources and understand how future training datasets for multiple NCD models may be improved. Our findings highlight the effectiveness of transfer learning in developing accurate diagnostic and predictive models for NCDs.


Subject(s)
Noncommunicable Diseases; Humans; Neural Networks, Computer; Machine Learning
8.
Plant J ; 117(1): 72-91, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37753661

ABSTRACT

Lipocalins constitute a conserved protein family that binds to and transports a variety of lipids, while fatty acid desaturases (FADs) are required for maintaining cell membrane fluidity under cold stress. Nevertheless, it remains unclear whether plant lipocalins promote FADs for cell membrane integrity under cold stress. Here, we identified the role of the OsTIL1 lipocalin in FAD-mediated glycerolipid remodeling under cold stress. Overexpression and CRISPR/Cas9-mediated gene editing experiments demonstrated that OsTIL1 positively regulated cold stress tolerance by protecting cell membrane integrity from reactive oxygen species damage and enhancing the activities of peroxidase and ascorbate peroxidase, which was confirmed by combining cold stress with the membrane rigidifier dimethyl sulfoxide or the H2O2 scavenger dimethyl thiourea. OsTIL1 overexpression induced higher 18:3 content, and higher 18:3/18:2 and (18:2 + 18:3)/18:1 ratios, than the wild type under cold stress, whereas the gene-edited mutant showed the opposite. Furthermore, the lipidomic analysis showed that OsTIL1 overexpression led to higher contents of 18:3-mediated glycerolipids, including galactolipids (monogalactosyldiacylglycerol and digalactosyldiacylglycerol) and phospholipids (phosphatidyl glycerol, phosphatidyl choline, phosphatidyl ethanolamine, phosphatidyl serine and phosphatidyl inositol) under cold stress. RNA-seq and enzyme-linked immunosorbent assay analyses indicated that OsTIL1 overexpression enhanced the transcription and enzyme abundance of four ω-3 FADs (OsFAD3-1/3-2, 7, and 8) under cold stress. These results reveal an important role of OsTIL1 in maintaining cell membrane integrity against oxidative damage under cold stress, providing a good candidate gene for improving cold tolerance in rice.


Subject(s)
Cold-Shock Response; Oryza; Reactive Oxygen Species/metabolism; Oryza/metabolism; Oxidative Stress; Cell Membrane/metabolism; Cold Temperature; Gene Expression Regulation, Plant; Plant Proteins/genetics; Plant Proteins/metabolism; Plants, Genetically Modified/genetics
9.
mSystems ; 8(6): e0047323, 2023 Dec 21.
Article in English | MEDLINE | ID: mdl-37921470

ABSTRACT

IMPORTANCE: We present here a new systems-level approach to decipher genetic factors and biological pathways associated with virulence and/or antibiotic treatment of bacterial pathogens. The power of this approach was demonstrated by application to a well-studied pathogen Pseudomonas aeruginosa PAO1. Our gene co-expression network-based approach unraveled known and unknown genes and their networks associated with pathogenicity in P. aeruginosa PAO1. The systems-level investigation of P. aeruginosa PAO1 helped identify putative pathogenicity and resistance-associated genetic factors that could not otherwise be detected by conventional approaches of differential gene expression analysis. The network-based analysis uncovered modules that harbor genes not previously reported by several original studies on P. aeruginosa virulence and resistance. These could potentially act as molecular determinants of P. aeruginosa PAO1 pathogenicity and responses to antibiotics.


Subject(s)
Pseudomonas Infections; Pseudomonas aeruginosa; Humans; Pseudomonas aeruginosa/genetics; Virulence/genetics; Gene Regulatory Networks/genetics; Virulence Factors/genetics; Pseudomonas Infections/drug therapy
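
The abstract for this entry does not say how the gene co-expression network was built, so the sketch below illustrates only the general idea under stated assumptions: genes are linked when the absolute Pearson correlation of their expression profiles meets a cutoff, and modules are taken as connected components of the resulting graph. The threshold value, function names, and gene labels are hypothetical and do not represent the authors' pipeline.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def coexpression_modules(expr, threshold=0.9):
    """Group genes into modules (connected components of the network).

    expr maps a gene name to its expression values across samples.
    An edge joins two genes when |Pearson r| >= threshold (assumed cutoff).
    """
    genes = sorted(expr)
    neighbors = {g: set() for g in genes}
    for g1, g2 in combinations(genes, 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            neighbors[g1].add(g2)
            neighbors[g2].add(g1)

    modules, seen = [], set()
    for gene in genes:
        if gene in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack, component = [gene], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbors[current] - component)
        seen |= component
        modules.append(component)
    return modules
```

A gene with no significant differential expression of its own can still land in a module dominated by differentially expressed genes, which is the kind of association a network-based analysis can surface and a per-gene test cannot.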
10.
Antibiotics (Basel) ; 12(11), 2023 Nov 08.
Article in English | MEDLINE | ID: mdl-37998806

ABSTRACT

In his 1945 Nobel Prize acceptance speech, Sir Alexander Fleming warned of antimicrobial resistance (AMR) if the necessary precautions were not taken diligently. As the growing threat of AMR continues to loom over humanity, we must look forward to alternative diagnostic tools and preventive measures to thwart looming economic collapse and untold mortality worldwide. The integration of machine learning (ML) methodologies within the framework of such tools/pipelines presents a promising avenue, offering unprecedented insights into the underlying mechanisms of resistance and enabling the development of more targeted and effective treatments. This paper explores the applications of ML in predicting and understanding AMR, highlighting its potential in revolutionizing healthcare practices. From the utilization of supervised-learning approaches to analyze genetic signatures of antibiotic resistance to the development of tools and databases, such as the Comprehensive Antibiotic Resistance Database (CARD), ML is actively shaping the future of AMR research. However, the successful implementation of ML in this domain is not without challenges. The dependence on high-quality data, the risk of overfitting, model selection, and potential bias in training data are issues that must be systematically addressed. Despite these challenges, the synergy between ML and biomedical research shows great promise in combating the growing menace of antibiotic resistance.

11.
Microorganisms ; 11(11), 2023 Nov 13.
Article in English | MEDLINE | ID: mdl-38004767

ABSTRACT

Since the discovery of the second chromosome in Rhodobacter sphaeroides 2.4.1 by Suwanto and Kaplan in 1989 and the revelation of gene sequences, multipartite genomes have been reported in over three hundred bacterial species under nine different phyla. This phenomenon shattered the dogma of a unipartite genome (a single circular chromosome) in bacteria. Recently, artificial intelligence (AI), machine learning (ML), and deep learning (DL) have emerged as powerful tools in the investigation of big data in a plethora of disciplines to decipher complex patterns in these data, including the large-scale analysis and interpretation of genomic data. An important inquiry in bacteriology pertains to the genetic factors that underlie the structural evolution of multipartite and unipartite bacterial species. Towards this goal, here we have attempted to leverage machine learning as a means to identify the genetic factors that, in general, differentiate bacteria with multipartite genomes from bacteria with unipartite genomes. In this study, deploying ML algorithms yielded two gene lists of interest: one that contains 46 discriminatory genes obtained following an assessment on all gene sets, and another that contains 35 discriminatory genes obtained based on an investigation of genes that are differentially present (or absent) in the genomes of the multipartite bacteria and their respective close relatives. Our study revealed a small pool of genes that discriminate between bacteria with multipartite genomes and their close relatives with single-chromosome genomes. Machine learning thus aided in uncovering the genetic factors that underlie the differentiation of bacterial multipartite and unipartite traits.

12.
Microorganisms ; 11(10), 2023 Oct 02.
Article in English | MEDLINE | ID: mdl-37894136

ABSTRACT

Taxonomic profiling of ancient metagenomic samples is challenging due to the accumulation of specific damage patterns on DNA over time. Although a number of methods for metagenome profiling have been developed, most of them have been assessed on modern metagenomes or simulated metagenomes mimicking modern metagenomes. Further, a comparative assessment of metagenome profilers on simulated metagenomes representing a spectrum of degradation depth, from the extremity of ancient (most degraded) to current or modern (not degraded) metagenomes, has not yet been performed. To understand the strengths and weaknesses of different metagenome profilers, we performed their comprehensive evaluation on simulated metagenomes representing human dental calculus microbiome, with the level of DNA damage successively raised to mimic modern to ancient metagenomes. All classes of profilers, namely, DNA-to-DNA, DNA-to-protein, and DNA-to-marker comparison-based profilers were evaluated on metagenomes with varying levels of damage simulating deamination, fragmentation, and contamination. Our results revealed that, compared to deamination and fragmentation, human and environmental contamination of ancient DNA (with modern DNA) has the most pronounced effect on the performance of each profiler. Further, the DNA-to-DNA (e.g., Kraken2, Bracken) and DNA-to-marker (e.g., MetaPhlAn4) based profiling approaches showed complementary strengths, which can be leveraged to elevate the state-of-the-art of ancient metagenome profiling.

13.
Environ Microbiome ; 18(1): 16, 2023 Mar 08.
Article in English | MEDLINE | ID: mdl-36890583

ABSTRACT

We present here POSMM (pronounced 'Possum'), the Python-Optimized Standard Markov Model classifier, which is a new incarnation of the Markov model approach to metagenomic sequence analysis. Built on top of SMM, a rapid Markov model-based classification algorithm, POSMM reintroduces the high sensitivity associated with alignment-free taxonomic classifiers to probe whole genome or metagenome datasets of increasingly prohibitive sizes. Logistic regression models generated and optimized using the Python sklearn library transform Markov model probabilities to scores suitable for thresholding. Featuring a dynamic database-free approach, models are generated directly from genome fasta files per run, making POSMM a valuable accompaniment to many other programs. By combining POSMM with ultrafast classifiers such as Kraken2, their complementary strengths can be leveraged to produce higher overall accuracy in metagenomic sequence classification than either achieves as a standalone classifier. POSMM is a user-friendly and highly adaptable tool designed for broad use by the metagenome scientific community.
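
The general Markov model approach to sequence classification named in this abstract can be sketched as follows. This is not POSMM's or SMM's code: the model order, the pseudocount, the length normalization, and all function names are assumptions chosen for illustration, and the logistic regression step that the abstract describes for converting probabilities into thresholdable scores is omitted.

```python
import math
from collections import defaultdict


def train_markov_model(genome, order=3, pseudocount=1.0):
    """Estimate log P(next base | preceding k-mer) from a genome string."""
    counts = defaultdict(lambda: defaultdict(float))
    for i in range(len(genome) - order):
        context = genome[i:i + order]
        counts[context][genome[i + order]] += 1.0

    model = {}
    for context, next_counts in counts.items():
        # Additive smoothing over the four canonical bases.
        total = sum(next_counts.values()) + 4 * pseudocount
        model[context] = {
            base: math.log((next_counts.get(base, 0.0) + pseudocount) / total)
            for base in "ACGT"
        }
    return model


def score_read(read, model, order=3):
    """Length-normalized log-likelihood of a read under one genome model."""
    uniform = math.log(0.25)  # fallback for contexts or bases never observed
    total, steps = 0.0, 0
    for i in range(len(read) - order):
        context = read[i:i + order]
        total += model.get(context, {}).get(read[i + order], uniform)
        steps += 1
    return total / steps if steps else float("-inf")


def classify(read, models, order=3):
    """Return the name of the genome model that scores the read highest."""
    return max(models, key=lambda name: score_read(read, models[name], order))
```

Because each model is built directly from a genome string, a sketch like this needs no prebuilt database, which mirrors the per-run model generation the abstract mentions; in practice the raw score would still need calibration before any thresholding.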

14.
Microb Genom ; 9(1), 2023 01.
Article in English | MEDLINE | ID: mdl-36748570

ABSTRACT

A significant challenge in bacterial genomics is to catalogue genes acquired through the evolutionary process of horizontal gene transfer (HGT). Both comparative genomics and sequence composition-based methods have often been invoked to quantify horizontally acquired genes in bacterial genomes. Comparative genomics methods rely on completely sequenced genomes, and therefore the confidence in their predictions increases as the databases become more enriched in completely sequenced genomes. Recent developments, including those in microbial genome sequencing, call for a reassessment of alien genes based on the information-rich resources currently available. We revisited the comparative genomics approach and developed a new algorithm for alien gene detection. Our algorithm compared favourably with the existing comparative genomics-based methods and is capable of detecting both recent and ancient transfers. It can be used as a standalone tool or in concert with other complementary algorithms for comprehensively cataloguing alien genes in bacterial genomes.


Subject(s)
Genome, Bacterial; Genomics; Genomics/methods; Algorithms; Biological Evolution
15.
New Phytol ; 237(5): 1711-1727, 2023 03.
Article in English | MEDLINE | ID: mdl-36401805

ABSTRACT

Reactive oxygen species (ROS) and the photoreceptor protein phytochrome B (phyB) play a key role in plant acclimation to stress. However, how phyB that primarily functions in the nuclei impacts ROS signaling mediated by respiratory burst oxidase homolog (RBOH) proteins that reside on the plasma membrane, during stress, is unknown. Arabidopsis thaliana and Oryza sativa mutants, RNA-Seq, bioinformatics, biochemistry, molecular biology, and whole-plant ROS imaging were used to address this question. Here, we reveal that phyB and RBOHs function as part of a key regulatory module that controls apoplastic ROS production, stress-response transcript expression, and plant acclimation in response to excess light stress. We further show that phyB can regulate ROS production during stress even if it is restricted to the cytosol and that phyB, respiratory burst oxidase protein D (RBOHD), and respiratory burst oxidase protein F (RBOHF) coregulate thousands of transcripts in response to light stress. Surprisingly, we found that phyB is also required for ROS accumulation in response to heat, wounding, cold, and bacterial infection. Our findings reveal that phyB plays a canonical role in plant responses to biotic and abiotic stresses, regulating apoplastic ROS production, possibly while at the cytosol, and that phyB and RBOHD/RBOHF function in the same regulatory pathway.


Subject(s)
Arabidopsis Proteins; Arabidopsis; Arabidopsis Proteins/metabolism; Phytochrome B/genetics; Phytochrome B/metabolism; Oxygen/metabolism; Reactive Oxygen Species/metabolism; Arabidopsis/metabolism; Stress, Physiological; Gene Expression Regulation, Plant
16.
Sci Rep ; 12(1): 22058, 2022 12 21.
Article in English | MEDLINE | ID: mdl-36543855

ABSTRACT

SARS-CoV-2 is the causative agent of COVID-19 that has infected over 642 million and killed over 6.6 million people around the globe. Underlying a wide range of clinical manifestations of this disease, from moderate to extremely severe systemic conditions, could be genes or pathways differentially expressing in the hosts. It is therefore important to gain insights into pathways involved in COVID-19 pathogenesis and host defense and thus understand the host response to this pathogen at the physiological and molecular level. To uncover genes and pathways involved in the differential clinical manifestations of this disease, we developed a novel gene co-expression network based pipeline that uses gene expression obtained from different SARS-CoV-2 infected human tissues. We leveraged the network to identify novel genes or pathways that likely differentially express and could be physiologically significant in the COVID-19 pathogenesis and progression but were deemed statistically non-significant and therefore not further investigated in the original studies. Our network-based approach aided in the identification of co-expression modules enriched in differentially expressing genes (DEGs) during different stages of COVID-19 and enabled discovery of novel genes involved in the COVID-19 pathogenesis, by virtue of their transcript abundance and association with genes expressing differentially in modules enriched in DEGs. We further prioritized by considering only those enriched gene modules that have most of their genes differentially expressed, inferred by the original studies or this study, and document here 7 novel genes potentially involved in moderate, 2 in severe, 48 in extremely severe COVID-19, and 96 novel genes involved in the progression of COVID-19 from severe to extremely severe conditions. Our study shines a new light on genes and their networks (modules) that drive the progression of COVID-19 from moderate to extremely severe condition. These findings could aid development of new therapeutics to combat COVID-19.


Subject(s)
COVID-19; Humans; COVID-19/genetics; SARS-CoV-2/genetics; Gene Regulatory Networks
17.
Arch Microbiol ; 205(1): 25, 2022 Dec 14.
Article in English | MEDLINE | ID: mdl-36515719

ABSTRACT

Since the discovery of the second chromosome in Rhodobacter sphaeroides 2.4.1 in 1989, multipartite genomes have been reported in over three hundred bacterial species under nine different phyla. This has shattered the unipartite (single chromosome) genome dogma in bacteria. Since then, many questions on various aspects of multipartite genomes in bacteria have been addressed. However, our understanding of how multipartite genomes emerge and evolve is still lacking. Importantly, the knowledge of genetic factors underlying the differences between multipartite and single-chromosome genomes is lacking. In this work, we have performed comparative evolutionary and functional genomics analyses to identify molecular factors that discriminate multipartite from unipartite bacteria, with the goal of deciphering taxon-specific factors, and those that are prevalent across the taxa, underlying these traits. We assessed the roles of evolutionary mechanisms, specifically gene gain, in driving the divergence of bacteria with single and multiple chromosomes. In addition, we performed functional genomic analysis to garner support for our findings from comparative evolutionary analysis. We found genes such as those encoding conserved hypothetical proteins in Deinococcus radiodurans R1, and putative phage phi-C31 gp36 major capsid-like and hypothetical proteins in Rhodobacter sphaeroides 2.4.1, which are located on accessory chromosomes in these bacteria but were found neither in the inferred ancestral sequences nor on the primary chromosomes, and were also absent from their closest single-chromosome relatives within the same clade. Our study shines a new light on the potential roles of the secondary chromosomes in helping bacteria with multipartite genomes to adapt to specialized environments or growth conditions.


Subject(s)
Genome, Bacterial; Rhodobacter sphaeroides; Genomics; Biological Evolution; Rhodobacter sphaeroides/genetics; Evolution, Molecular; Chromosomes, Bacterial/genetics
18.
Microorganisms ; 10(11), 2022 Oct 23.
Article in English | MEDLINE | ID: mdl-36363694

ABSTRACT

Antimicrobial resistance (AMR) threatens the healthcare system worldwide with the rise of emerging drug resistant infectious agents. AMR may render the current therapeutics ineffective or diminish their efficacy, and its rapid dissemination can have unmitigated health and socioeconomic consequences. Just like with many other health problems, recent computational advances including developments in machine learning or artificial intelligence hold a prodigious promise in deciphering genetic factors underlying emergence and dissemination of AMR and in aiding development of therapeutics for more efficient AMR solutions. Current machine learning frameworks focus mainly on known AMR genes and are, therefore, prone to missing genes that have not been implicated in resistance yet, including many uncharacterized genes whose functions have not yet been elucidated. Furthermore, new resistance traits may evolve from these genes leading to the rise of superbugs, and therefore, these genes need to be characterized. To infer novel resistance genes, we used complete gene sets of several bacterial strains known to be susceptible or resistant to specific drugs and associated phenotypic information within a machine learning framework that enabled prioritizing genes potentially involved in resistance. Further, homology modeling of proteins encoded by prioritized genes and subsequent molecular docking studies indicated stable interactions between these proteins and the antimicrobials that the strains containing these proteins are known to be resistant to. Our study highlights the capability of a machine learning framework to uncover novel genes that have not yet been implicated in resistance to any antimicrobials and thus could spur further studies targeted at neutralizing AMR.

19.
Open Biol ; 12(11): 220169, 2022 11.
Article in English | MEDLINE | ID: mdl-36446404

ABSTRACT

Horizontal gene transfer (HGT) is a major source of phenotypic innovation and a mechanism of niche adaptation in prokaryotes. Quantification of HGT is critical to decipher its myriad roles in microbial evolution and adaptation. Advances in genome sequencing and bioinformatics have augmented our ability to understand the microbial world, particularly the direct or indirect influence of HGT on diverse life forms. Methods for detecting HGT can be classified into phylogenetic-based and parametric or composition-based approaches. Here, we exploited the complementary strengths of both approaches to construct a high-confidence horizontal gene flow network. Our network is unique in its ability to detect the transfer of native genes of a genome to genomes from other taxa, thus establishing donor and recipient organisms (taxa) directly, rather than through a post hoc analysis as is the practice with several other approaches. The scale-free horizontal gene flow network presented here provides new insights into modes of transfer for the exchange of genetic information and also illuminates differential gene flow across phyla.


Subject(s)
Gene Flow; Prokaryotic Cells; Phylogeny; Gene Regulatory Networks; Computational Biology
20.
OMICS ; 26(8): 422-439, 2022 08.
Article in English | MEDLINE | ID: mdl-35925817

ABSTRACT

Bacterial genomes are chimeras of DNA of different ancestries. Deconstructing chimeric genomes is central to understanding the evolutionary trajectories of their disparate components and thus the organisms as a whole in the light of their evolutionary contexts. Of specific interest is to delineate and quantify native (vertically inherited) and alien (horizontally acquired) components of bacterial genomes and also specify genomic fractions that represent different donor sources. An agglomerative clustering procedure that prioritizes grouping of proximal similar genomic segments has previously been invoked for this purpose in conjunction with a recursive segmentation procedure. Surprisingly, however, the relative strengths and weaknesses of different clustering approaches to deciphering bacterial chimerism have not yet been investigated, despite the need to robustly interpret tens of thousands of completely sequenced bacterial genomes and nearly complete genome assemblies available in the public databases. To bridge this knowledge gap and develop more robust approaches, we assessed different clustering methods, including segment order based (proximal) clustering, hierarchical clustering, affinity propagation clustering, and a novel network clustering approach on chimeric genomes modeled after bacterial genomes representing a broad spectrum of compositional complexity. Although segment order-based clustering and network clustering compared favorably with the other approaches in discriminating between native and alien DNA at genome optimized settings, network clustering did consistently better than other methods at parametric settings optimized on all test genomes together. Segment order-based clustering and hierarchical clustering outperformed other methods in alien DNA identification while preserving donor identity in the genomes. Our study highlights the strengths and weaknesses of different approaches and suggests how this can be leveraged to achieve a more robust deconstruction of bacterial chimerism.


Subject(s)
Chimerism; Genome, Bacterial; Bacteria/genetics; Cluster Analysis; Genome, Bacterial/genetics; Genomics/methods