Results 1 - 20 of 1,460
1.
Bioinformatics ; 40(Suppl 2): ii165-ii173, 2024 Sep 01.
Article in English | MEDLINE | ID: mdl-39230701

ABSTRACT

MOTIVATION: Functional profiling of metagenomic samples is essential to decipher the functional capabilities of microbial communities. Traditional and more widely used functional profilers in the context of metagenomics rely on aligning reads against a known reference database. However, aligning sequencing reads against a large and fast-growing database is computationally expensive. In general, k-mer-based sketching techniques have been successfully used in metagenomics to address this bottleneck, notably in taxonomic profiling. In this work, we describe leveraging FracMinHash (implemented in sourmash, a publicly available software), a k-mer-sketching algorithm, to obtain functional profiles of metagenome samples. RESULTS: We show how pieces of the sourmash software (and the resulting FracMinHash sketches) can be put together in a pipeline to functionally profile a metagenomic sample. We named our pipeline fmh-funprofiler. We report that the functional profiles obtained using this pipeline demonstrate comparable completeness and better purity compared to the profiles obtained using other alignment-based methods when applied to simulated metagenomic data. We also report that fmh-funprofiler is 39-99× faster in wall-clock time, and consumes up to 40-55× less memory. Coupled with the KEGG database, this method not only replicates fundamental biological insights but also highlights novel signals from the Human Microbiome Project datasets. AVAILABILITY AND IMPLEMENTATION: This fast and lightweight metagenomic functional profiler is freely available and can be accessed here: https://github.com/KoslickiLab/fmh-funprofiler. All scripts of the analyses we present in this manuscript can be found on GitHub.


Subject(s)
Algorithms , Metagenome , Metagenomics , Software , Metagenomics/methods , Metagenome/genetics , Humans , Microbiota/genetics , Databases, Genetic
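
The FracMinHash idea named in this abstract can be illustrated with a minimal sketch: hash every k-mer and keep only the hashes that fall in the lowest 1/scaled fraction of the hash space, then compare sketches by containment. This is not the sourmash or fmh-funprofiler implementation. The hash function (BLAKE2b), the nucleotide k-mers, and the function names and parameter values below are all assumptions chosen for illustration.

```python
import hashlib

MAX_HASH = 2**64  # size of the 64-bit hash space used in this sketch


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (illustrative hash, assumed here)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def fracminhash(seq, k=7, scaled=4):
    """Keep only hashes in the lowest 1/scaled fraction of the hash space."""
    threshold = MAX_HASH // scaled
    return {h for h in map(hash_kmer, kmers(seq, k)) if h < threshold}


def containment(query_sketch, reference_sketch):
    """Fraction of the query sketch that also appears in the reference sketch."""
    if not query_sketch:
        return 0.0
    return len(query_sketch & reference_sketch) / len(query_sketch)


# Hypothetical sequences: a "gene" and a "sample" that contains it.
gene = "ATGGCGTACGTTAGCCGATAGGCTTACGATCGATCGGATCCGATTAGC"
sample = "CCGTA" + gene + "GGATTC"
score = containment(fracminhash(gene), fracminhash(sample))
```

Because the threshold depends only on the hash value, the sketch of a sequence contained in another is a subset of the larger sequence's sketch, which is what makes containment estimates from sketches possible without alignment.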
2.
mSystems ; 9(9): e0024224, 2024 Sep 17.
Article in English | MEDLINE | ID: mdl-39158287

ABSTRACT

Although long-read sequencing has enabled obtaining high-quality and complete genomes from metagenomes, many challenges still remain to completely decompose a metagenome into its constituent prokaryotic and viral genomes. This study focuses on decomposing an estuarine metagenome to obtain a more accurate estimate of microbial diversity. To achieve this, we developed a new bead-based DNA extraction method, a novel bin refinement method, and obtained 150 Gbp of Nanopore sequencing. We estimate that there are ~500 bacterial and archaeal species in our sample and obtained 68 high-quality bins (>90% complete, <5% contamination, ≤5 contigs, contig length of >100 kbp, and all ribosomal and tRNA genes). We also obtained many contigs of picoeukaryotes, environmental DNA of larger eukaryotes such as mammals, and complete mitochondrial and chloroplast genomes and detected ~40,000 viral populations. Our analysis indicates that there are only a few strains that comprise most of the species abundances. IMPORTANCE: Ocean and estuarine microbiomes play critical roles in global element cycling and ecosystem function. Despite the importance of these microbial communities, many species still have not been cultured in the lab. Environmental sequencing is the primary way the function and population dynamics of these communities can be studied. Long-read sequencing provides an avenue to overcome limitations of short-read technologies to obtain complete microbial genomes but comes with its own technical challenges, such as needed sequencing depth and obtaining high-quality DNA. We present here new sampling and bioinformatics methods to attempt decomposing an estuarine microbiome into its constituent genomes. Our results suggest there are only a few strains that comprise most of the species abundances from viruses to picoeukaryotes, and to fully decompose a metagenome of this diversity requires 1 Tbp of long-read sequencing. 
We anticipate that as long-read sequencing technologies continue to improve, less sequencing will be needed.


Subject(s)
Estuaries , Metagenomics , Microbiota , Viruses , Microbiota/genetics , Metagenomics/methods , San Francisco , Viruses/genetics , Viruses/classification , Viruses/isolation & purification , Metagenome/genetics , Bacteria/genetics , Bacteria/classification , Archaea/genetics , Archaea/virology , Eukaryota/genetics , Genome, Viral/genetics
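
The high-quality bin criteria quoted in this abstract (>90% complete, <5% contamination, ≤5 contigs, contig length >100 kbp, all ribosomal and tRNA genes) can be written as a simple predicate. The record layout and field names below are hypothetical, and reading "contig length of >100 kbp" as applying to every contig in the bin is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bin:
    """Hypothetical summary record for one genome bin."""
    completeness: float           # percent
    contamination: float          # percent
    contig_lengths_bp: List[int]  # length of every contig in the bin
    has_all_rrna: bool            # all ribosomal RNA genes found
    has_all_trna: bool            # all tRNA genes found


def is_high_quality(b: Bin) -> bool:
    """Apply the thresholds quoted in the abstract to one bin."""
    return (
        b.completeness > 90
        and b.contamination < 5
        and 0 < len(b.contig_lengths_bp) <= 5
        # Assumption: every contig must exceed 100 kbp.
        and min(b.contig_lengths_bp) > 100_000
        and b.has_all_rrna
        and b.has_all_trna
    )
```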
3.
mSystems ; 9(9): e0074624, 2024 Sep 17.
Article in English | MEDLINE | ID: mdl-39136455

ABSTRACT

Characterization of microbial community metabolic output is crucial to understanding their functions. Construction of genome-scale metabolic models from metagenome-assembled genomes (MAG) has enabled prediction of metabolite production by microbial communities, yet little is known about their accuracy. Here, we examined the performance of two approaches for metabolite prediction from metagenomes, one that is MAG-guided and another that is taxonomic reference-guided. We applied both on shotgun metagenomics data from human and environmental samples, and validated findings in the human samples using untargeted metabolomics. We found that in human samples, where taxonomic profiling is optimized and reference genomes are readily available, when number of input taxa was normalized, the reference-guided approach predicted more metabolites than the MAG-guided approach. The two approaches showed significant overlap but each identified metabolites not predicted in the other. Pathway enrichment analyses identified significant differences in inferences derived from data based on the approach, highlighting the need for caution in interpretation. In environmental samples, when the number of input taxa was normalized, the reference-guided approach predicted more metabolites than the MAG-guided approach for total metabolites in both sample types and non-redundant metabolites in seawater samples. Nonetheless, as was observed for the human samples, the approaches overlapped substantially but also predicted metabolites not observed in the other. 
Our findings report on the utility of a complementary input to genome-scale metabolic model construction that is less computationally intensive, forgoing MAG assembly and refinement, and that can be applied to shallow shotgun sequencing where MAGs cannot be generated. IMPORTANCE: Little is known about the accuracy of genome-scale metabolic models (GEMs) of microbial communities despite their influence on inferring community metabolic outputs and culture conditions. The performance of GEMs for metabolite prediction from metagenomes was assessed by applying two approaches to shotgun metagenomics data from human and environmental samples, and validating findings in the human samples using untargeted metabolomics. The performance of each approach was found to be dependent on sample type, but collectively, the reference-guided approach predicted more metabolites than the MAG-guided approach. Despite the differences, the predictions from the approaches overlapped substantially but each identified metabolites not predicted in the other. We found significant differences in biological inferences based on the approach, with some examples of uniquely enriched pathways in one group being invalidated when using the alternative approach, highlighting the need for caution in interpretation of GEMs.


Subject(s)
Metabolomics , Metagenomics , Microbiota , Humans , Metagenomics/methods , Metabolomics/methods , Microbiota/genetics , Metagenome/genetics
4.
Comput Biol Med ; 180: 108852, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39137667

ABSTRACT

BACKGROUND: Current methods for comparing metagenomes, derived from whole-genome sequencing reads, include top-down metrics or parametric models such as metagenome-diversity, and bottom-up, non-parametric, model-free machine learning approaches like Naïve Bayes for k-mer-profiling. However, both types are limited in their ability to effectively and comprehensively identify and catalogue unique or enriched metagenomic genes, a critical task in comparative metagenomics. This challenge is significant and complex due to its NP-hard nature, which means computational time grows exponentially, or even faster, with the problem size, rendering it impractical for even the fastest supercomputers without heuristic approximation algorithms. METHOD: In this study, we introduce a new framework, MC (Metagenome-Comparison), designed to exhaustively detect and catalogue unique or enriched metagenomic genes (MGs) and their derivatives, including metagenome functional gene clusters (MFGC), or more generally, the operational metagenomic unit (OMU) that can be considered the counterpart of the OTU (operational taxonomic unit) from amplicon sequencing reads. The MC is essentially a heuristic search algorithm guided by pairs of new metrics (termed MG-specificity or OMU-specificity, MG-specificity diversity or OMU-specificity diversity). It is further constrained by statistical significance (P-value) implemented as a pair of statistical tests. RESULTS: We evaluated the MC using large metagenomic datasets related to obesity, diabetes, and IBD, and found that the proportions of unique and enriched metagenomic genes ranged from 0.001% to 0.08 % and 0.08%-0.82 % respectively, and less than 10 % for the MFGC. CONCLUSION: The MC provides a robust method for comparing metagenomes at various scales, from baseline MGs to various function/pathway clusters of metagenomes, collectively termed OMUs.


Subject(s)
Metagenome , Metagenomics , Humans , Metagenomics/methods , Metagenome/genetics , Whole Genome Sequencing/methods , Algorithms
5.
Nat Commun ; 15(1): 7536, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39214976

ABSTRACT

Nucleocytoplasmic large DNA viruses (NCLDVs; also called giant viruses), constituting the phylum Nucleocytoviricota, can infect a wide range of eukaryotes and exchange genetic material with not only their hosts but also prokaryotes and phages. A few NCLDVs were reported to encode genes conferring resistance to beta-lactam, trimethoprim, or pyrimethamine, suggesting that they are potential vehicles for the transmission of antibiotic resistance genes (ARGs) in the biome. However, the incidence of ARGs across the phylum Nucleocytoviricota, their evolutionary characteristics, their dissemination potential, and their association with virulence factors remain unexplored. Here, we systematically investigated ARGs of 1416 NCLDV genomes including those of almost all currently available cultured isolates and high-quality metagenome-assembled genomes from diverse habitats across the globe. We reveal that 39.5% of them carry ARGs, which is approximately 37 times higher than that for phage genomes. A total of 12 ARG types are encoded by NCLDVs. Phylogenies of the three most abundant NCLDV-encoded ARGs hint that NCLDVs acquire ARGs from not only eukaryotes but also prokaryotes and phages. Two NCLDV-encoded trimethoprim resistance genes are demonstrated to confer trimethoprim resistance in Escherichia coli. The presence of ARGs in NCLDV genomes is significantly correlated with mobile genetic elements and virulence factors.


Subject(s)
Genome, Viral , Giant Viruses , Phylogeny , Giant Viruses/genetics , Genome, Viral/genetics , Drug Resistance, Microbial/genetics , Bacteriophages/genetics , Bacteriophages/isolation & purification , Anti-Bacterial Agents/pharmacology , Metagenome/genetics , Gene Transfer, Horizontal , Trimethoprim/pharmacology , Drug Resistance, Bacterial/genetics
6.
Nat Commun ; 15(1): 7563, 2024 Aug 31.
Article in English | MEDLINE | ID: mdl-39214983

ABSTRACT

Small open reading frames (smORFs) shorter than 100 codons are widespread and perform essential roles in microorganisms, where they encode proteins active in several cell functions, including signal pathways, stress response, and antibacterial activities. However, the ecology, distribution and role of small proteins in the global microbiome remain unknown. Here, we construct a global microbial smORFs catalog (GMSC) derived from 63,410 publicly available metagenomes across 75 distinct habitats and 87,920 high-quality isolate genomes. GMSC contains 965 million non-redundant smORFs with comprehensive annotations. We find that archaea harbor more smORFs proportionally than bacteria. We moreover provide a tool called GMSC-mapper to identify and annotate small proteins from microbial (meta)genomes. Overall, this publicly-available resource demonstrates the immense and underexplored diversity of small proteins.


Subject(s)
Archaea , Bacteria , Metagenome , Microbiota , Open Reading Frames , Microbiota/genetics , Open Reading Frames/genetics , Bacteria/genetics , Bacteria/classification , Bacteria/metabolism , Metagenome/genetics , Archaea/genetics , Archaea/metabolism , Archaea/classification , Molecular Sequence Annotation , Bacterial Proteins/genetics , Bacterial Proteins/metabolism
7.
Nat Commun ; 15(1): 7551, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39215001

ABSTRACT

Sewage metagenomics has risen to prominence in urban population surveillance of pathogens and antimicrobial resistance (AMR). Unknown species with similarity to known genomes cause database bias in reference-based metagenomics. To improve surveillance, we seek to recover sewage genomes and develop a quantification and correlation workflow for these genomes and AMR over time. We use longitudinal sewage sampling in seven treatment plants from five major European cities to explore the utility of catch-all sequencing of these population-level samples. Using metagenomic assembly methods, we recover 2332 metagenome-assembled genomes (MAGs) from prokaryotic species, 1334 of which were previously undescribed. These genomes account for ~69% of sequenced DNA and provide insight into sewage microbial dynamics. Rotterdam (Netherlands) and Copenhagen (Denmark) show strong seasonal microbial community shifts, while Bologna, Rome, (Italy) and Budapest (Hungary) have occasional blooms of Pseudomonas-dominated communities, accounting for up to ~95% of sample DNA. Seasonal shifts and blooms present challenges for effective sewage surveillance. We find that bacteria of known shared origin, like human gut microbiota, form communities, suggesting the potential for source-attributing novel species and their ARGs through network community analysis. This could significantly improve AMR tracking in urban environments.


Subject(s)
Bacteria , Metagenome , Metagenomics , Microbiota , Seasons , Sewage , Sewage/microbiology , Metagenomics/methods , Humans , Microbiota/genetics , Bacteria/genetics , Bacteria/classification , Bacteria/isolation & purification , Metagenome/genetics , Europe
8.
Article in English | MEDLINE | ID: mdl-39160620

ABSTRACT

Cold seeps in the deep sea are closely linked to energy exploration as well as global climate change. The alkane-dominated chemical energy-driven model makes cold seeps an oasis of deep-sea life, showcasing an unparalleled reservoir of microbial genetic diversity. Here, by analyzing 113 metagenomes collected from 14 global sites across 5 cold seep types, we present a comprehensive Cold Seep Microbiomic Database (CSMD) to archive the genomic and functional diversity of cold seep microbiomes. The CSMD includes over 49 million non-redundant genes and 3175 metagenome-assembled genomes, which represent 1895 species spanning 105 phyla. In addition, beta diversity analysis indicates that both the sampling site and cold seep type have a substantial impact on the prokaryotic microbiome community composition. Heterotrophic and anaerobic metabolisms are prevalent in microbial communities, accompanied by considerable mixotrophs and facultative anaerobes, highlighting the versatile metabolic potential in cold seeps. Furthermore, secondary metabolic gene cluster analysis indicates that at least 98.81% of the sequences potentially encode novel natural products, with ribosomally synthesized and post-translationally modified peptides being the predominant type widely distributed in archaea and bacteria. Overall, the CSMD represents a valuable resource that would enhance the understanding and utilization of global cold seep microbiomes.


Subject(s)
Archaea , Metagenome , Microbiota , Metagenome/genetics , Archaea/genetics , Archaea/metabolism , Archaea/classification , Microbiota/genetics , Bacteria/genetics , Bacteria/classification , Bacteria/metabolism , Biological Products/metabolism , Cold Temperature , Phylogeny , Seawater/microbiology , Metagenomics/methods , Biodiversity
9.
PeerJ ; 12: e17805, 2024.
Article in English | MEDLINE | ID: mdl-39099658

ABSTRACT

Background: Tracking the spread of antibiotic-resistant bacteria is critical to reduce global morbidity and mortality associated with human and animal infections. There is a need to understand the role that wild animals play in the maintenance and transfer of antibiotic resistance genes (ARGs). Methods: This study used metagenomics to identify and compare the abundance of bacterial species and ARGs detected in the gut microbiomes from sympatric humans and wild mouse lemurs in a forest-dominated, roadless region of Madagascar near Ranomafana National Park. We examined the contribution of human geographic location toward differences in ARG abundance and compared the genomic similarity of ARGs between host source microbiomes. Results: Alpha and beta diversity of species and ARGs between host sources were distinct but maintained a similar number of detectable ARG alleles. Four distinct tetracycline resistance-associated genes were differentially more abundant in humans than in lemurs. There was no significant difference in human ARG diversity from different locations. Human and lemur microbiomes shared 14 distinct ARGs with highly conserved nucleotide identity. Synteny of ARG-associated assemblies revealed a distinct multidrug-resistant gene cassette carrying dfrA1 and aadA1 present in human and lemur microbiomes without evidence of geographic overlap, suggesting that these resistance genes could be widespread in this ecosystem. Further investigation into intermediary processes that maintain drug-resistant bacteria in wildlife settings is needed.


Subject(s)
Gastrointestinal Microbiome , Metagenome , Animals , Madagascar , Humans , Metagenome/genetics , Gastrointestinal Microbiome/genetics , Sympatry , Rural Population , Metagenomics , Bacteria/genetics , Bacteria/drug effects , Drug Resistance, Bacterial/genetics , Genes, Bacterial , Cheirogaleidae/genetics , Cheirogaleidae/microbiology
10.
Nat Commun ; 15(1): 6789, 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-39117673

ABSTRACT

Oil reservoirs, being one of the significant subsurface repositories of energy and carbon, host diverse microbial communities affecting energy production and carbon emissions. Viruses play crucial roles in the ecology of microbiomes; however, their distribution and ecological significance in oil reservoirs remain undetermined. Here, we assemble a catalogue encompassing viral and prokaryotic genomes sourced from oil reservoirs. The catalogue comprises 7,229 prokaryotic genomes and 3,886 viral Operational Taxonomic Units (vOTUs) from 182 oil reservoir metagenomes. The results show that viruses are widely distributed in oil reservoirs, and 85% of vOTUs in oil reservoirs are detected in less than 10% of the samples, highlighting the heterogeneous nature of viral communities within oil reservoirs. Through combined microcosm enrichment experiments and bioinformatics analysis, we validate the ecological roles of viruses in regulating the community structure of sulfate-reducing microorganisms, primarily through a virulent lifestyle. Taken together, this study uncovers a rich diversity of viruses and their ecological functions within oil reservoirs, offering a comprehensive understanding of the role of viral communities in the biogeochemical cycles of the deep biosphere.


Subject(s)
Biodiversity , Metagenome , Oil and Gas Fields , Viruses , Oil and Gas Fields/virology , Oil and Gas Fields/microbiology , Viruses/genetics , Viruses/classification , Viruses/isolation & purification , Metagenome/genetics , Microbiota/genetics , Genome, Viral/genetics , Phylogeny , Bacteria/genetics , Bacteria/classification , Bacteria/isolation & purification , Metagenomics
11.
BMC Bioinformatics ; 25(1): 266, 2024 Aug 14.
Article in English | MEDLINE | ID: mdl-39143554

ABSTRACT

BACKGROUND: Construction of co-occurrence networks in metagenomic data often employs correlation to infer pairwise relationships between microbes. However, biological systems are complex and often display non-linear behavior. Therefore, the reliance on correlation alone may overlook important relationships and fail to capture the full breadth of intricacies presented in underlying interaction networks. It is of interest to incorporate metrics that are robust in detecting not only linear relationships but non-linear ones as well. RESULTS: In this paper, we explore the use of various mutual information (MI) estimation approaches for quantifying pairwise relationships in biological data and compare their performances against two traditional measures: Pearson's correlation coefficient, r, and Spearman's rank correlation coefficient, ρ. Metrics are tested on both simulated data designed to mimic pairwise relationships that may be found in ecological systems and real data from a previous study on C. diff infection. The results demonstrate that, in the case of asymmetric relationships, mutual information estimators can provide better detection ability than Pearson's or Spearman's correlation coefficients. Specifically, we find that these estimators have elevated performances in the detection of exploitative relationships, demonstrating the potential benefit of including them in future metagenomic studies. CONCLUSIONS: Mutual information (MI) can uncover complex pairwise relationships in biological data that may be missed by traditional measures of association. The inclusion of such relationships when constructing co-occurrence networks can result in a more comprehensive analysis than the use of correlation alone.


Subject(s)
Metagenomics , Metagenomics/methods , Algorithms , Metagenome/genetics
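
The contrast this abstract draws between correlation and mutual information can be seen on a toy example. The sketch below uses a simple histogram (plug-in) MI estimator, which is only one of many possible estimators and is not necessarily among those evaluated in the paper. The simulated quadratic relationship, the sample size, and the bin count are assumptions chosen for illustration.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Plug-in (histogram) estimate of mutual information, in nats."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()              # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)    # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)    # marginal of y, shape (1, bins)
    nonzero = pxy > 0                      # skip empty cells to avoid log(0)
    ratio = pxy[nonzero] / (px @ py)[nonzero]
    return float(np.sum(pxy[nonzero] * np.log(ratio)))


rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 2000)
y_nonlinear = x**2 + rng.normal(0, 0.05, x.size)   # strong but non-monotonic
y_independent = rng.uniform(-1, 1, x.size)         # no relationship at all

r = np.corrcoef(x, y_nonlinear)[0, 1]              # Pearson's r
mi_nonlinear = mutual_information(x, y_nonlinear)
mi_independent = mutual_information(x, y_independent)
```

In this setup Pearson's r stays close to zero even though y is almost fully determined by x, while the MI estimate for the dependent pair is clearly larger than for the independent pair. Histogram estimators are biased upward for small samples, so the independent-pair value serves as a rough baseline rather than an exact zero.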
12.
OMICS ; 28(8): 394-407, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39029911

ABSTRACT

In the field of bioinformatics, amplicon sequencing of 16S rRNA genes has long been used to investigate community membership and taxonomic abundance in microbiome studies. As we can observe, shotgun metagenomics has become the dominant method in this field. This is largely owing to advancements in sequencing technology, which now allow for random sequencing of the entire genetic content of a microbiome. Furthermore, this method allows profiling both genes and the microbiome's membership. Although these methods have provided extensive insights into various microbiomes, they solely assess the existence of organisms or genes, without determining their active role within the microbiome. Microbiome scholarship now includes metatranscriptomics to decipher how a community of microorganisms responds to changing environmental conditions over a period of time. Metagenomic studies identify the microbes that make up a community but metatranscriptomics explores the diversity of active genes within that community, understanding their expression profile and observing how these genes respond to changes in environmental conditions. This expert review article offers a critical examination of the computational metatranscriptomics tools for studying the transcriptomes of microbial communities. First, we unpack the reasons behind the need for community transcriptomics. Second, we explore the prospects and challenges of metatranscriptomic workflows, starting with isolation and sequencing of the RNA community, then moving on to bioinformatics approaches for quantifying RNA features, and statistical techniques for detecting differential expression in a community. Finally, we discuss strengths and shortcomings in relation to other microbiome analysis approaches, pipelines, use cases and limitations, and contextualize metatranscriptomics as a tool for clinical metagenomics.


Subject(s)
Computational Biology , Metagenomics , Microbiota , Transcriptome , Metagenomics/methods , Microbiota/genetics , Humans , Computational Biology/methods , Transcriptome/genetics , Gene Expression Profiling/methods , RNA, Ribosomal, 16S/genetics , Metagenome/genetics
13.
mSystems ; 9(8): e0057324, 2024 Aug 20.
Article in English | MEDLINE | ID: mdl-38980052

ABSTRACT

Metagenomic sequencing has advanced our understanding of biogeochemical processes by providing an unprecedented view into the microbial composition of different ecosystems. While the amount of metagenomic data has grown rapidly, simple-to-use methods to analyze and compare across studies have lagged behind. Thus, tools expressing the metabolic traits of a community are needed to broaden the utility of existing data. Gene abundance profiles are a relatively low-dimensional embedding of a metagenome's functional potential and are, thus, tractable for comparison across many samples. Here, we compare the abundance of KEGG Ortholog Groups (KOs) from 6,539 metagenomes from the Joint Genome Institute's Integrated Microbial Genomes and Metagenomes (JGI IMG/M) database. We find that samples cluster into terrestrial, aquatic, and anaerobic ecosystems with marker KOs reflecting adaptations to these environments. For instance, functional clusters were differentiated by the metabolism of antibiotics, photosynthesis, methanogenesis, and surprisingly GC content. Using this functional gene approach, we reveal the broad-scale patterns shaping microbial communities and demonstrate the utility of ortholog abundance profiles for representing a rapidly expanding body of metagenomic data. IMPORTANCE: Metagenomics, or the sequencing of DNA from complex microbiomes, provides a view into the microbial composition of different environments. Metagenome databases were created to compile sequencing data across studies, but it remains challenging to compare and gain insight from these large data sets. Consequently, there is a need to develop accessible approaches to extract knowledge across metagenomes. The abundance of different orthologs (i.e., genes that perform a similar function across species) provides a simplified representation of a metagenome's metabolic potential that can easily be compared with others. 
In this study, we cluster the ortholog abundance profiles of thousands of metagenomes from diverse environments and uncover the traits that distinguish them. This work provides a simple to use framework for functional comparison and advances our understanding of how the environment shapes microbial communities.


Subject(s)
Metagenome , Metagenomics , Metagenomics/methods , Metagenome/genetics , Ecosystem , Cluster Analysis , Microbiota/genetics
14.
mSystems ; 9(8): e0021324, 2024 Aug 20.
Article in English | MEDLINE | ID: mdl-38980053

ABSTRACT

Shotgun metagenomics allows comprehensive sampling of the genomic information of microbes in a given environment and is a tool of choice for studying complex microbial systems. Mapping sequencing reads against a set of reference or metagenome-assembled genomes is in principle a simple and powerful approach to define the species-level composition of the microbial community under investigation. However, despite the widespread use of this approach, there is no established way to properly interpret the alignment results, with arbitrary relative abundance thresholds being routinely used to discriminate between present and absent species. Such an approach can be affected by significant biases, especially in the identification of rare species. Therefore, it is important to develop new metrics to overcome these biases. Here, we present Metapresence, a new tool to perform reliable identification of the species in metagenomic samples based on the distribution of mapped reads on the reference genomes. The analysis is based on two metrics describing the breadth of coverage and the genomic distance between consecutive reads. We demonstrate the high precision and wide applicability of the tool using data from various synthetic communities, a real mock community, and the gut microbiome of healthy individuals and antibiotic-associated-diarrhea patients. Overall, our results suggest that the proposed approach has a robust performance in hard-to-analyze microbial communities containing contaminated or closely related genomes in low abundance. IMPORTANCE: Despite the prevalent use of genome-centric alignment-based methods to characterize microbial community composition, a standardized approach for accurately identifying the species within a sample is lacking. Currently, arbitrary relative abundance thresholds are commonly employed for this purpose. 
However, due to the inherent complexity of genome structure and biases associated with genome-centric approaches, this practice tends to be imprecise. Notably, it introduces significant biases, particularly in the identification of rare species. The method presented here addresses these limitations and contributes significantly to overcoming inaccuracies in precisely defining community composition, especially when dealing with rare members.


Subject(s)
Metagenome , Metagenomics , Metagenomics/methods , Humans , Metagenome/genetics , Gastrointestinal Microbiome/genetics , Genome, Bacterial/genetics , Software , Bacteria/genetics , Bacteria/classification , Bacteria/isolation & purification
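
The two kinds of signal this abstract mentions, breadth of coverage and the distance between consecutive mapped reads, can be illustrated with a toy calculation. This is not Metapresence's code or its exact metrics. The function names are hypothetical, and the expected-breadth formula is the standard approximation for reads placed uniformly at random, used here as an assumed reference point.

```python
import math


def breadth_of_coverage(read_starts, read_length, genome_length):
    """Fraction of genome positions covered by at least one read."""
    covered = [False] * genome_length
    for start in read_starts:
        for pos in range(start, min(start + read_length, genome_length)):
            covered[pos] = True
    return sum(covered) / genome_length


def expected_breadth(n_reads, read_length, genome_length):
    """Breadth expected if reads landed uniformly at random: 1 - exp(-depth)."""
    depth = n_reads * read_length / genome_length
    return 1.0 - math.exp(-depth)


def mean_gap(read_starts):
    """Mean distance between consecutive read start positions."""
    starts = sorted(read_starts)
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    return sum(gaps) / len(gaps) if gaps else float("nan")


# Same read count and depth, very different read distributions.
genome_length, read_length = 1000, 50
spread = list(range(0, 1000, 100))   # reads spread along the genome
stacked = list(range(0, 50, 5))      # reads piled on one region
```

Both read sets give the same average depth, yet the stacked reads cover far less of the genome than random placement would predict and have much shorter gaps. A pattern like that is more consistent with reads mapping to a shared or contaminating region than with a genome that is genuinely present.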
15.
Nat Commun ; 15(1): 5734, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38977664

ABSTRACT

Metagenomic sequencing has provided great advantages in the characterisation of microbiomes, but currently available analysis tools lack the ability to combine subspecies-level taxonomic resolution and accurate abundance estimation with functional profiling of assembled genomes. To define the microbiome and its associations with human health, improved tools are needed to enable comprehensive understanding of the microbial composition and elucidation of the phylogenetic and functional relationships between the microbes. Here, we present MAGinator, a freely available tool, tailored for profiling of shotgun metagenomics datasets. MAGinator provides de novo identification of subspecies-level microbes and accurate abundance estimates of metagenome-assembled genomes (MAGs). MAGinator utilises the information from both gene- and contig-based methods yielding insight into both taxonomic profiles and the origin of genes and genetic content, used for inference of functional content of each sample by host organism. Additionally, MAGinator facilitates the reconstruction of phylogenetic relationships between the MAGs, providing a framework to identify clade-level differences.


Subject(s)
Metagenome , Metagenomics , Microbiota , Phylogeny , Metagenomics/methods , Metagenome/genetics , Humans , Microbiota/genetics , Software , Bacteria/genetics , Bacteria/classification , Genome, Bacterial/genetics
16.
Nat Commun ; 15(1): 6346, 2024 Jul 27.
Article in English | MEDLINE | ID: mdl-39068184

ABSTRACT

Viruses are core components of the human microbiome, impacting health through interactions with gut bacteria and the immune system. Most human microbiome viruses are bacteriophages, which exclusively infect bacteria. Until recently, most gut virome studies focused on low taxonomic resolution (e.g., viral operational taxonomic units), hampering population-level analyses. We previously identified an expansive and widespread bacteriophage lineage in inhabitants of Amsterdam, the Netherlands. Here, we study their biodiversity and evolution in various human populations. Based on a phylogeny using sequences from six viral genome databases, we propose the Candidatus order Heliusvirales. We identify heliusviruses in 82% of 5441 individuals across 39 studies, and in nine metagenomes from humans that lived in Europe and North America between 1000 and 5000 years ago. We show that a large lineage started to diversify when Homo sapiens first appeared some 300,000 years ago. Ancient peoples and modern hunter-gatherers have distinct Ca. Heliusvirales populations with lower richness than modern urbanized people. Urbanized people suffering from type 1 and type 2 diabetes, as well as inflammatory bowel disease, have higher Ca. Heliusvirales richness than healthy controls. We thus conclude that these ancient core members of the human gut virome have thrived with increasingly westernized lifestyles.


Subject(s)
Bacteriophages , Gastrointestinal Microbiome , Phylogeny , Humans , Bacteriophages/genetics , Bacteriophages/isolation & purification , Bacteriophages/classification , Gastrointestinal Microbiome/genetics , Genome, Viral/genetics , Metagenome/genetics , Virome/genetics , Inflammatory Bowel Diseases/virology , Biodiversity , Diabetes Mellitus, Type 2/virology , Female , Male , Europe , Netherlands , Adult
17.
BMC Bioinformatics ; 25(1): 237, 2024 Jul 12.
Article in English | MEDLINE | ID: mdl-38997633

ABSTRACT

BACKGROUND: With the emergence of Oxford Nanopore technology, on-site sequencing of 16S rRNA from environmental samples is now possible. Because of the error level and structure of such data, their analysis requires a database of reference sequences. However, many taxa from complex and diverse environments are poorly represented in publicly available databases. In this paper, we propose the METASEED pipeline for the reconstruction of full-length 16S sequences from such environments, in order to improve the reference for the subsequent use of on-site sequencing. RESULTS: We show that combining high-precision short-read sequencing of both 16S and the full metagenome from the same samples allows us to reconstruct high-quality 16S sequences from the more abundant taxa. A significant novelty is the carefully designed collection of metagenome reads that match the 16S amplicons, based on a combination of uniqueness and abundance. Compared to alternative approaches, this produces superior results. CONCLUSION: Our pipeline will facilitate numerous studies of unknown microorganisms, thus improving the understanding of diverse environments. The pipeline is a potential tool for generating a full-length 16S rRNA gene database for any environment.


Subject(s)
Metagenome , RNA, Ribosomal, 16S , RNA, Ribosomal, 16S/genetics , Metagenome/genetics , Sequence Analysis, DNA/methods , Databases, Genetic
18.
BMC Bioinformatics ; 25(1): 241, 2024 Jul 16.
Article in English | MEDLINE | ID: mdl-39014300

ABSTRACT

BACKGROUND: Using next-generation sequencing technologies, scientists can sequence complex microbial communities directly from the environment. Significant insights into the structure, diversity, and ecology of microbial communities have resulted from the study of metagenomics. The assembly of reads into longer contigs, which are then binned into groups of contigs that correspond to different species in the metagenomic sample, is a crucial step in the analysis of metagenomics. It is necessary to organize these contigs into operational taxonomic units (OTUs) for further taxonomic profiling and functional analysis. For binning, which is synonymous with the clustering of OTUs, the tetra-nucleotide frequency (TNF) is typically utilized as a compositional feature for each OTU. RESULTS: In this paper, we present AFIT, a new l-mer statistic vector for each contig, and AFITBin, a novel method for metagenomic binning based on AFIT and a matrix factorization method. To evaluate the performance of the AFIT vector, the t-SNE algorithm is used to compare species clustering based on AFIT and TNF information. In addition, the efficacy of AFITBin is demonstrated on both simulated and real datasets in comparison to state-of-the-art binning methods such as MetaBAT 2, MaxBin 2.0, CONCOCT, MetaCon, SolidBin, BusyBee Web, and MetaBinner. To further analyze the performance of the proposed AFIT vector, we compare the barcodes of the AFIT vector and the TNF vector. CONCLUSION: The results demonstrate that AFITBin shows superior performance in taxonomic identification compared to existing methods, leveraging the AFIT vector for improved results in metagenomic binning. This approach holds promise for advancing the analysis of metagenomic data, providing more reliable insights into microbial community composition and function. AVAILABILITY: A Python package is available at: https://github.com/SayehSobhani/AFITBin .


Subject(s)
Algorithms , Metagenomics , Metagenomics/methods , Nucleotides/genetics , High-Throughput Nucleotide Sequencing/methods , Software , Microbiota/genetics , Sequence Analysis, DNA/methods , Cluster Analysis , Contig Mapping/methods , Metagenome/genetics
19.
Mol Genet Genomics ; 299(1): 73, 2024 Jul 27.
Article in English | MEDLINE | ID: mdl-39066857

ABSTRACT

Exploring the intricate relationships between plants and their resident microorganisms is crucial not only for developing new methods to improve disease resistance and crop yields but also for understanding their co-evolutionary dynamics. Our research delves into the role of the phyllosphere-associated microbiome, especially Actinomycetota species, in enhancing pathogen resistance in Theobroma grandiflorum, or cupuassu, an agriculturally valuable Amazonian fruit tree vulnerable to witches' broom disease caused by Moniliophthora perniciosa. While breeding resistant cupuassu genotypes is a possible solution, the capacity of the Actinomycetota phylum to produce beneficial metabolites offers an alternative approach yet to be explored in this context. Utilizing advanced long-read sequencing and metagenomic analysis, we examined Actinomycetota from the phyllosphere of a disease-resistant cupuassu genotype, identifying 11 Metagenome-Assembled Genomes across eight genera. Our comparative genomic analysis uncovered 54 Biosynthetic Gene Clusters related to antitumor, antimicrobial, and plant growth-promoting activities, alongside cutinases and type VII secretion system-associated genes. These results indicate the potential of phyllosphere-associated Actinomycetota in cupuassu for inducing resistance or antagonism against pathogens. By integrating our genomic discoveries with the existing knowledge of cupuassu's defense mechanisms, we developed a model hypothesizing the synergistic or antagonistic interactions between plant and identified Actinomycetota during plant-pathogen interactions. This model offers a framework for understanding the intricate dynamics of microbial influence on plant health. In conclusion, this study underscores the significance of the phyllosphere microbiome, particularly Actinomycetota, in the broader context of harnessing microbial interactions for plant health. These findings offer valuable insights for enhancing agricultural productivity and sustainability.


Subject(s)
Plant Diseases , Plant Leaves , Plant Leaves/microbiology , Plant Leaves/genetics , Plant Diseases/microbiology , Plant Diseases/genetics , Disease Resistance/genetics , Microbiota/genetics , Ecosystem , Actinobacteria/genetics , Actinobacteria/isolation & purification , Metagenomics/methods , Metagenome/genetics , Phylogeny , Brassicaceae/microbiology , Brassicaceae/genetics
20.
Genes (Basel) ; 15(7)2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39062701

ABSTRACT

Acute febrile illness (AFI) and severe neurological disorders (SNDs) often present diagnostic challenges due to their potential origins from a wide range of infectious agents. Nanopore metagenomics is emerging as a powerful tool for identifying the microorganisms potentially responsible for these undiagnosed clinical cases. In this study, we aim to shed light on the etiological agents underlying AFI and SND cases that conventional diagnostic methods have not been able to fully elucidate. Our approach involved analyzing samples from fourteen hospitalized patients using a comprehensive nanopore metagenomic approach. This process included RNA extraction and enrichment using the SMART-9N protocol, followed by nanopore sequencing. Subsequent steps involved quality control, host DNA/cDNA removal, de novo genome assembly, and taxonomic classification. Our findings in AFI cases revealed a spectrum of disease-associated microbes, including Escherichia coli, Streptococcus sp., Human Immunodeficiency Virus 1 (Subtype B), and Human Pegivirus. Similarly, SND cases revealed the presence of pathogens such as Escherichia coli, Clostridium sp., and Dengue virus type 2 (Genotype-II lineage). This study employed a metagenomic analysis method, demonstrating its efficiency and adaptability in pathogen identification. Our investigation successfully identified pathogens likely associated with AFI and SNDs, underscoring the feasibility of retrieving near-complete genomes from RNA viruses. These findings offer promising prospects for advancing our understanding and control of infectious diseases, by facilitating detailed genomic analysis which is critical for developing targeted interventions and therapeutic strategies.


Subject(s)
Metagenomics , Nanopore Sequencing , Humans , Metagenomics/methods , Nanopore Sequencing/methods , Male , Female , Nervous System Diseases/microbiology , Nervous System Diseases/genetics , Nervous System Diseases/virology , Adult , Middle Aged , Nanopores , Aged , Metagenome/genetics , Fever/microbiology , Fever/virology , Escherichia coli/genetics