1.
Mol Biol Evol ; 41(8)2024 Aug 02.
Article in English | MEDLINE | ID: mdl-39137184

ABSTRACT

Segmented RNA viruses are a complex group of RNA viruses with multisegment genomes. Reconstructing complete segmented viruses is crucial for advancing our understanding of viral diversity, evolution, and public health impact. Using metatranscriptomic data to identify known and novel segmented viruses has sped up the survey of segmented viruses in various ecosystems. However, the high genetic diversity and the difficulty in binning complete segmented genomes present significant challenges in segmented virus reconstruction. Current virus detection tools are primarily used to identify nonsegmented viral genomes. This study presents SegVir, a novel tool designed to identify segmented RNA viruses and reconstruct their complete genomes from complex metatranscriptomes. SegVir leverages both close and remote homology searches to accurately detect conserved and divergent viral segments. Additionally, we introduce a new method that can evaluate the genome completeness and conservation based on gene content. Our evaluations on simulated datasets demonstrate SegVir's superior sensitivity and precision compared to existing tools. Moreover, in experiments using real data, we identified some virus segments missing in the NCBI database, underscoring SegVir's potential to enhance viral metagenome analysis. The source code and supporting data of SegVir are available via https://github.com/HubertTang/SegVir.


Subject(s)
Viral Genome , RNA Viruses , RNA Viruses/genetics , Transcriptome , Viral RNA/genetics , Software , Metagenome , Metagenomics/methods
2.
Mol Ecol Resour ; 24(2): e13904, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37994269

ABSTRACT

Several computational frameworks and workflows that recover genomes from prokaryotes, eukaryotes and viruses from metagenomes exist. Yet, it is difficult for scientists with little bioinformatics experience to evaluate quality, annotate genes, dereplicate, assign taxonomy and calculate relative abundance and coverage of genomes belonging to different domains. MuDoGeR is a user-friendly tool tailored for those familiar with Unix command-line environment that makes it easy to recover genomes of prokaryotes, eukaryotes and viruses from metagenomes, either alone or in combination. We tested MuDoGeR using 24 individual-isolated genomes and 574 metagenomes, demonstrating the applicability for a few samples and high throughput. While MuDoGeR can recover eukaryotic viral sequences, its characterization is predominantly skewed towards bacterial and archaeal viruses, reflecting the field's current state. However, acting as a dynamic wrapper, the MuDoGeR is designed to constantly incorporate updates and integrate new tools, ensuring its ongoing relevance in the rapidly evolving field. MuDoGeR is open-source software available at https://github.com/mdsufz/MuDoGeR. Additionally, MuDoGeR is also available as a Singularity container.


Subject(s)
Metagenome , Viruses , Metagenomics , Software , Bacteria/genetics , Phylogeny , Viruses/genetics
3.
Biotechnol Adv ; 69: 108261, 2023 12.
Article in English | MEDLINE | ID: mdl-37741424

ABSTRACT

Production of food-related products using microorganisms in an environmentally friendly manner is a crucial solution to global food safety and environmental pollution issues. Traditional microbial modification methods rely on artificial selection or natural mutations, which require time for repeated screening and reproduction, leading to unstable results. Therefore, it is imperative to develop rapid, efficient, and precise microbial modification technologies. This review summarizes recent advances in the construction of gene editing and metabolic regulation toolkits based on the clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated proteins (CRISPR-Cas) systems and their applications in reconstructing food microorganism metabolic networks. The development and application of gene editing toolkits from single-site gene editing to multi-site and genome-scale gene editing was also introduced. Moreover, it presented a detailed introduction to CRISPR interference, CRISPR activation, and logic circuit toolkits for metabolic network regulation. Moreover, the current challenges and future prospects for developing CRISPR genetic toolkits were also discussed.


Subject(s)
CRISPR-Cas Systems , Gene Editing , CRISPR-Cas Systems/genetics , Gene Editing/methods , Food
4.
BMC Genomics ; 24(1): 401, 2023 Jul 17.
Article in English | MEDLINE | ID: mdl-37460975

ABSTRACT

BACKGROUND: Bacteria of the Borrelia burgdorferi sensu lato (s.l.) complex can cause Lyme borreliosis. Different B. burgdorferi s.l. genospecies vary in their host and vector associations and human pathogenicity but the genetic basis for these adaptations is unresolved and requires completed and reliable genomes for comparative analyses. The de novo assembly of a complete Borrelia genome is challenging due to the high levels of complexity, represented by a high number of circular and linear plasmids that are dynamic, showing mosaic structure and sequence homology. Previous work demonstrated that even advanced approaches, such as a combination of short-read and long-read data, might lead to incomplete plasmid reconstruction. Here, using recently developed high-fidelity (HiFi) PacBio sequencing, we explored strategies to obtain gap-free, complete and high quality Borrelia genome assemblies. Optimizing genome assembly, quality control and refinement steps, we critically appraised existing techniques to create a workflow that leads to improved genome reconstruction. RESULTS: Despite the latest available technologies, stand-alone sequencing and assembly methods are insufficient for the generation of complete and high quality Borrelia genome assemblies. We developed a workflow pipeline for the de novo genome assembly for Borrelia using several types of sequence data and incorporating multiple assemblers to recover the complete genome including both circular and linear plasmid sequences. CONCLUSION: Our study demonstrates that, with HiFi data and an ensemble reconstruction pipeline with refinement steps, chromosomal and plasmid sequences can be fully resolved, even for complex genomes such as Borrelia. The presented pipeline may be of interest for the assembly of further complex microbial genomes.


Subject(s)
Borrelia burgdorferi Group , Borrelia burgdorferi , Borrelia , Lyme Disease , Humans , Borrelia/genetics , Bacterial Genome , Phylogeny , Borrelia burgdorferi/genetics , Lyme Disease/microbiology , Borrelia burgdorferi Group/genetics
5.
F1000Res ; 12: 1091, 2023.
Article in English | MEDLINE | ID: mdl-38716230

ABSTRACT

Background: Accurate genome sequences form the basis for genomic surveillance programs, the added value of which was impressively demonstrated during the COVID-19 pandemic by tracing transmission chains, discovering new viral lineages and mutations, and assessing them for infectiousness and resistance to available treatments. Amplicon strategies employing Illumina sequencing have become widely established for variant detection and reference-based reconstruction of SARS-CoV-2 genomes, and are routine bioinformatics tasks. Yet, specific challenges arise when analyzing amplicon data, for example, when crucial and even lineage-determining mutations occur near primer sites. Methods: We present CoVpipe2, a bioinformatics workflow developed at the Public Health Institute of Germany to reconstruct SARS-CoV-2 genomes based on short-read sequencing data accurately. The decisive factor here is the reliable, accurate, and rapid reconstruction of genomes, considering the specifics of the used sequencing protocol. Besides fundamental tasks like quality control, mapping, variant calling, and consensus generation, we also implemented additional features to ease the detection of mixed samples and recombinants. Results: We highlight common pitfalls in primer clipping, detecting heterozygote variants, and dealing with low-coverage regions and deletions. We introduce CoVpipe2 to address the above challenges and have compared and successfully validated the pipeline against selected publicly available benchmark datasets. CoVpipe2 features high usability, reproducibility, and a modular design that specifically addresses the characteristics of short-read amplicon protocols but can also be used for whole-genome short-read sequencing data. Conclusions: CoVpipe2 has seen multiple improvement cycles and is continuously maintained alongside frequently updated primer schemes and new developments in the scientific community. 
Our pipeline is easy to set up and use and can serve as a blueprint for other pathogens in the future due to its flexibility and modularity, providing a long-term perspective for continuous support. CoVpipe2 is written in Nextflow and is freely accessible from https://github.com/rki-mf1/CoVpipe2 under the GPL3 license.

6.
Microbiome ; 10(1): 209, 2022 12 02.
Article in English | MEDLINE | ID: mdl-36457010

ABSTRACT

BACKGROUND: The accurate and comprehensive analyses of genome-resolved metagenomics largely depend on the reconstruction of reference-quality (complete and high-quality) genomes from diverse microbiomes. Closing gaps in draft genomes have been approaching with the inclusion of Nanopore long reads; however, genome quality improvement requires extensive and time-consuming high-accuracy short-read polishing. RESULTS: Here, we introduce NanoPhase, an open-source tool to reconstruct reference-quality genomes from complex metagenomes using only Nanopore long reads. Using Kit 9 and Q20+ chemistries, we first evaluated the feasibility of NanoPhase using a ZymoBIOMICS gut microbiome standard (including 21 strains), then sequenced the complex activated sludge microbiome and reconstructed 275 MAGs with median completeness of ~ 90%. As a result, NanoPhase improved the MAG contiguity (median MAG N50: 735 Kb, 44-86X compared to conventional short-read-based methods) while maintaining high accuracy, allowing for a full and accurate investigation of target microbiomes. Additionally, leveraging these high-contiguity reference-quality genomes, we identified 165 prophages within 111 MAGs, with 5 as active prophages, indicating the prophage was a neglected source of genetic diversity within microbial populations and influencer in shaping microbial composition in the activated sludge microbiome. CONCLUSIONS: Our results demonstrated that NanoPhase enables reference-quality genome reconstruction from complex metagenomes directly using only Nanopore long reads. Furthermore, besides the 16S rRNA genes and biosynthetic gene clusters, the generated high-accuracy and high-contiguity MAGs improved the host identification of critical mobile genetic elements, e.g., prophage, serving as a genomic blueprint to investigate the microbial potential and ecology in the activated sludge ecosystem. Video Abstract.


Subject(s)
Microbiota , Nanopores , Metagenome/genetics , Metagenomics , 16S Ribosomal RNA/genetics , Sewage , Microbiota/genetics , Prophages
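The NanoPhase abstract above reports contiguity as a median MAG N50. As a reminder of what that statistic measures, here is a minimal sketch of the standard N50 definition; the function name `n50` and the example lengths are illustrative assumptions, not part of NanoPhase.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (in kb) for a single genome bin.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
example_n50 = n50(example_lengths)
```

A higher N50 at the same total length means the genome is split into fewer, longer contigs, which is the sense in which the abstract compares long-read and short-read assemblies.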
7.
Proc Natl Acad Sci U S A ; 119(40): e2209139119, 2022 10 04.
Article in English | MEDLINE | ID: mdl-36161960

ABSTRACT

Decrypting the rearrangements that drive mammalian chromosome evolution is critical to understanding the molecular bases of speciation, adaptation, and disease susceptibility. Using 8 scaffolded and 26 chromosome-scale genome assemblies representing 23/26 mammal orders, we computationally reconstructed ancestral karyotypes and syntenic relationships at 16 nodes along the mammalian phylogeny. Three different reference genomes (human, sloth, and cattle) representing phylogenetically distinct mammalian superorders were used to assess reference bias in the reconstructed ancestral karyotypes and to expand the number of clades with reconstructed genomes. The mammalian ancestor likely had 19 pairs of autosomes, with nine of the smallest chromosomes shared with the common ancestor of all amniotes (three still conserved in extant mammals), demonstrating a striking conservation of synteny for ∼320 My of vertebrate evolution. The numbers and types of chromosome rearrangements were classified for transitions between the ancestral mammalian karyotype, descendent ancestors, and extant species. For example, 94 inversions, 16 fissions, and 14 fusions that occurred over 53 My differentiated the therian from the descendent eutherian ancestor. The highest breakpoint rate was observed between the mammalian and therian ancestors (3.9 breakpoints/My). Reconstructed mammalian ancestor chromosomes were found to have distinct evolutionary histories reflected in their rates and types of rearrangements. The distributions of genes, repetitive elements, topologically associating domains, and actively transcribed regions in multispecies homologous synteny blocks and evolutionary breakpoint regions indicate that purifying selection acted over millions of years of vertebrate evolution to maintain syntenic relationships of developmentally important genes and regulatory landscapes of gene-dense chromosomes.


Subject(s)
Molecular Evolution , Karyotype , Mammals , Synteny , Animals , Cattle/genetics , Mammalian Chromosomes/genetics , Eutheria/genetics , Humans , Mammals/genetics , Phylogeny , Sloths/genetics , Synteny/genetics
8.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35136954

ABSTRACT

Shotgun sequencing is routinely employed to study bacteria in microbial communities. With the vast amount of shotgun sequencing reads generated in a metagenomic project, it is crucial to determine the microbial composition at the strain level. This study investigated 20 computational tools that attempt to infer bacterial strain genomes from shotgun reads. For the first time, we discussed the methodology behind these tools. We also systematically evaluated six novel-strain-targeting tools on the same datasets and found that BHap, mixtureS and StrainFinder performed better than other tools. Because the performance of the best tools is still suboptimal, we discussed future directions that may address the limitations.


Subject(s)
Metagenomics , Microbiota , Bacteria/genetics , Bacterial Genome , Metagenome , Metagenomics/methods , DNA Sequence Analysis/methods
9.
Biostatistics ; 23(2): 626-642, 2022 04 13.
Article in English | MEDLINE | ID: mdl-33221831

ABSTRACT

Three-dimensional (3D) genome spatial organization is critical for numerous cellular processes, including transcription, while certain conformation-driven structural alterations are frequently oncogenic. Genome architecture had been notoriously difficult to elucidate, but the advent of the suite of chromatin conformation capture assays, notably Hi-C, has transformed understanding of chromatin structure and provided downstream biological insights. Although many findings have flowed from direct analysis of the pairwise proximity data produced by these assays, there is added value in generating corresponding 3D reconstructions deriving from superposing genomic features on the reconstruction. Accordingly, many methods for inferring 3D architecture from proximity data have been advanced. However, none of these approaches exploit the fact that single chromosome solutions constitute a one-dimensional (1D) curve in 3D. Rather, this aspect has either been addressed by imposition of constraints, which is both computationally burdensome and cell type specific, or ignored with contiguity imposed after the fact. Here, we target finding a 1D curve by extending principal curve methodology to the metric scaling problem. We illustrate how this approach yields a sequence of candidate solutions, indexed by an underlying smoothness or degrees-of-freedom parameter, and propose methods for selection from this sequence. We apply the methodology to Hi-C data obtained on IMR90 cells and so are positioned to evaluate reconstruction accuracy by referencing orthogonal imaging data. The results indicate the utility and reproducibility of our principal curve approach in the face of underlying structural variation.


Subject(s)
Chromatin , Genome , Chromatin/genetics , Chromosomes , Genomics/methods , Humans , Reproducibility of Results
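Methods that infer 3D architecture from pairwise proximity data commonly start by turning contact counts into distances before any scaling or curve fitting. The sketch below shows one widely used convention, an inverse power law; the function name `contacts_to_distances`, the exponent `alpha`, and the toy matrix are assumptions for illustration and are not taken from the paper above.

```python
import math


def contacts_to_distances(contacts, alpha=1.0):
    """Convert a symmetric contact-count matrix into a distance matrix.

    Assumed transfer function: d_ij = c_ij ** (-alpha).
    Pairs with zero observed contacts get an infinite distance,
    and the diagonal is set to zero.
    """
    n = len(contacts)
    distances = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            count = contacts[i][j]
            distances[i][j] = count ** (-alpha) if count > 0 else math.inf
    return distances


# Hypothetical 3-locus contact matrix.
toy_contacts = [
    [0, 4, 1],
    [4, 0, 2],
    [1, 2, 0],
]
toy_distances = contacts_to_distances(toy_contacts, alpha=0.5)
```

Loci that contact each other often end up close together, so the resulting matrix can be handed to a metric scaling routine. The choice of `alpha` strongly affects the reconstruction and is usually tuned or estimated rather than fixed.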
10.
Gigascience ; 122022 12 28.
Article in English | MEDLINE | ID: mdl-36807539

ABSTRACT

BACKGROUND: Musa beccarii (Musaceae) is a banana species native to Borneo, sometimes grown as an ornamental plant. The basic chromosome number of Musa species is x = 7, 10, or 11; however, M. beccarii has a basic chromosome number of x = 9 (2n = 2x = 18), which is the same basic chromosome number of species in the sister genera Ensete and Musella. Musa beccarii is in the section Callimusa, which is sister to the section Musa. We generated a high-quality chromosome-scale genome assembly of M. beccarii to better understand the evolution and diversity of genomes within the family Musaceae. FINDINGS: The M. beccarii genome was assembled by long-read and Hi-C sequencing, and genes were annotated using both long Iso-seq and short RNA-seq reads. The size of M. beccarii was the largest among all known Musaceae assemblies (∼570 Mbp) due to the expansion of transposable elements and increased 45S ribosomal DNA sites. By synteny analysis, we detected extensive genome-wide chromosome fusions and fissions between M. beccarii and the other Musa and Ensete species, far beyond those expected from differences in chromosome number. Within Musaceae, M. beccarii showed a reduced number of terpenoid synthase genes, which are related to chemical defense, and enrichment in lipid metabolism genes linked to the physical defense of the cell wall. Furthermore, type III polyketide synthase was the most abundant biosynthetic gene cluster (BGC) in M. beccarii. BGCs were not conserved in Musaceae genomes. CONCLUSIONS: The genome assembly of M. beccarii is the first chromosome-scale genome assembly in the Callimusa section in Musa, which provides an important genetic resource that aids our understanding of the evolution of Musaceae genomes and enhances our knowledge of the pangenome.


Subject(s)
Musa , Musaceae , Musa/genetics , Musaceae/genetics , Plant Genome , Chromosomes , Ribosomal DNA , Phylogeny
11.
Proc Priv Enhanc Technol ; 2021(3): 28-48, 2021.
Article in English | MEDLINE | ID: mdl-34746296

ABSTRACT

Sharing genome data in a privacy-preserving way stands as a major bottleneck in front of the scientific progress promised by the big data era in genomics. A community-driven protocol named genomic data-sharing beacon protocol has been widely adopted for sharing genomic data. The system aims to provide a secure, easy to implement, and standardized interface for data sharing by only allowing yes/no queries on the presence of specific alleles in the dataset. However, beacon protocol was recently shown to be vulnerable against membership inference attacks. In this paper, we show that privacy threats against genomic data sharing beacons are not limited to membership inference. We identify and analyze a novel vulnerability of genomic data-sharing beacons: genome reconstruction. We show that it is possible to successfully reconstruct a substantial part of the genome of a victim when the attacker knows the victim has been added to the beacon in a recent update. In particular, we show how an attacker can use the inherent correlations in the genome and clustering techniques to run such an attack in an efficient and accurate way. We also show that even if multiple individuals are added to the beacon during the same update, it is possible to identify the victim's genome with high confidence using traits that are easily accessible by the attacker (e.g., eye color or hair type). Moreover, we show how a reconstructed genome using a beacon that is not associated with a sensitive phenotype can be used for membership inference attacks to beacons with sensitive phenotypes (e.g., HIV+). The outcome of this work will guide beacon operators on when and how to update the content of the beacon and help them (along with the beacon participants) make informed decisions.
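The core observation in the abstract above is that a beacon answers only yes/no presence queries, so comparing answers before and after an update exposes alleles contributed by the newly added individual. The following is a minimal sketch of that differencing step only. The `Beacon` class, the `newly_present` helper, and the allele tuples are hypothetical, and the sketch omits the correlation and clustering techniques the authors describe.

```python
class Beacon:
    """Toy beacon: answers whether any member carries a given allele."""

    def __init__(self, genomes):
        # Each genome is a set of alleles; the beacon stores only their union.
        self._alleles = set().union(*genomes)

    def query(self, allele):
        return allele in self._alleles


def newly_present(before, after, candidate_alleles):
    """Return alleles answered 'no' before the update and 'yes' after it."""
    return {
        allele
        for allele in candidate_alleles
        if after.query(allele) and not before.query(allele)
    }


# Hypothetical alleles written as (chromosome, position, alternate base).
A = ("chr1", 100, "T")
B = ("chr1", 250, "G")
C = ("chr2", 75, "A")
D = ("chr3", 900, "C")
E = ("chr4", 12, "T")

existing_member = {A, B}
victim = {B, C, D}

old_beacon = Beacon([existing_member])
new_beacon = Beacon([existing_member, victim])
leaked = newly_present(old_beacon, new_beacon, {A, B, C, D, E})
```

In this toy case only `C` and `D` leak directly, because `B` was already present before the update. Recovering alleles like `B` is where the inherent correlations in the genome mentioned in the abstract would come in.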

12.
J Comput Biol ; 28(11): 1156-1179, 2021 11.
Article in English | MEDLINE | ID: mdl-34783601

ABSTRACT

Recurrent whole genome duplication and the ensuing loss of redundant genes-fractionation-complicate efforts to reconstruct the gene orders and chromosomes of the ancestors associated with the nodes of a phylogeny. Loss of genes disrupts the gene adjacencies key to current techniques. With our RACCROCHE pipeline, instead of starting with the inference of short ancestral segments, we suggest delaying the choice of gene adjacencies while we accumulate many more syntenically validated generalized (gapped) adjacencies. We obtain longer ancestral contigs using maximum weight matching (MWM). Similarly, we do not construct chromosomes by successively piecing together contigs into larger segments, but rather compile counts of pairwise contig co-occurrences on the set of extant genomes and use these to cluster the contigs. Chromosome-level contig assemblies for a monoploid genome emerge naturally at each node of the phylogeny and the contigs then can be ordered along the chromosome. Sampling alternative MWM solutions, visualizing heat maps, and applying gap statistics allow us to estimate the number of chromosomes in the reconstruction. We introduce several measures of quality: length of contigs, continuity of contig structure on successive ancestors, coverage of the extant genome by the reconstruction, and rearrangement relations among the inferred chromosomes. The reconstructed ancestors are visualized by painting the ancestral projections on the descendant genomes. We submit genomes drawn from a broad range of monocot orders to our pipeline, confirming the tetraploidization event "tau" in the stem lineage between the alismatids and the lilioids. We show additional applications to the Solanaceae and to four Brassica genomes, producing evidence about the monoploid ancestor in each case.


Subject(s)
Computational Biology/methods , Gene Duplication , Magnoliopsida/classification , Algorithms , Molecular Evolution , Gene Order , Plant Genome , Magnoliopsida/genetics , Phylogeny
13.
Ecol Evol ; 10(23): 12700-12709, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33304488

ABSTRACT

Paleogenomics is the nascent discipline concerned with sequencing and analysis of genome-scale information from historic, ancient, and even extinct samples. While once inconceivable due to the challenges of DNA damage, contamination, and the technical limitations of PCR-based Sanger sequencing, following the dawn of the second-generation sequencing revolution, it has rapidly become a reality. However, a significant challenge facing ancient DNA studies on extinct species is the lack of closely related reference genomes against which to map the sequencing reads from ancient samples. Although bioinformatic efforts to improve the assemblies have focused mainly on mapping algorithms, in this article we explore the potential of an alternative approach, namely using a reconstructed ancestral genome as the reference for mapping DNA sequences of ancient samples. Specifically, we present a preliminary proof of concept for a general framework and demonstrate how under certain evolutionary divergence thresholds, considerable mapping improvements can be easily obtained.

14.
Front Genet ; 11: 516269, 2020.
Article in English | MEDLINE | ID: mdl-33101371

ABSTRACT

PacBio long reads sequencing presents several potential advantages for DNA assembly, including being able to provide more complete gene profiling of metagenomic samples. However, lower single-pass accuracy can make gene discovery and assembly for low-abundance organisms difficult. To evaluate the application and performance of PacBio long reads and Illumina HiSeq short reads in metagenomic analyses, we directly compared various assemblies involving PacBio and Illumina sequencing reads based on two anaerobic digestion microbiome samples from a biogas fermenter. Using a PacBio platform, 1.58 million long reads (19.6 Gb) were produced with an average length of 7,604 bp. Using an Illumina HiSeq platform, 151.2 million read pairs (45.4 Gb) were produced. Hybrid assemblies using PacBio long reads and HiSeq contigs produced improvements in assembly statistics, including an increase in the average contig length, contig N50 size, and number of large contigs. Interestingly, depth-based hybrid assemblies generated a higher percentage of complete genes (98.86%) compared to those based on HiSeq contigs only (40.29%), because the PacBio reads were long enough to cover many repeating short elements and capture multiple genes in a single read. Additionally, the incorporation of PacBio long reads led to considerable advantages regarding reducing contig numbers and increasing the completeness of the genome reconstruction, which was poorly assembled and binned when using HiSeq data alone. From this comparison of PacBio long reads with Illumina HiSeq short reads related to complex microbiome samples, we conclude that PacBio long reads can produce longer contigs, more complete genes, and better genome binning, thereby offering more information about metagenomic samples.

15.
Genes (Basel) ; 11(8)2020 07 24.
Article in English | MEDLINE | ID: mdl-32722275

ABSTRACT

The Solanum pennellii introgression lines (ILs) have been exploited to map quantitative trait loci (QTLs) and identify favorable alleles that could improve fruit quality traits in tomato varieties. Over the past few years, ILs exhibiting increased content of ascorbic acid in the fruit have been selected, among which the sub-line R182. The aims of this work were to identify the genes of the wild donor S. pennellii harbored by the sub-line and to detect genes controlling ascorbic acid accumulation by using genomics tools. A Genotyping-By-Sequencing (GBS) approach confirmed that no wild introgressions were present in the sub-line besides one region on chromosome 7. By using a dense single nucleotide polymorphism (SNP) map obtained by RNA sequencing (RNA-Seq), the wild region of the sub-line was finely identified; thus, defining 39 wild genes that replaced 33 genes of the ILs genetic background (cv. M82). The differentially expressed genes mapping in the region and the variants detected among the cultivated and the wild alleles evidenced the potential role of the novel genes present in the wild region. Interestingly, one upregulated gene, annotated as a major facilitator superfamily protein, showed a novel structure in R182, with respect to the parental lines. These genes will be further investigated using gene editing strategies.


Subject(s)
Ascorbic Acid/metabolism , Fruit/metabolism , Plant Proteins/metabolism , Quantitative Trait Loci , Solanum lycopersicum/genetics , Plant Chromosomes/genetics , Fruit/genetics , Fruit/growth & development , Genomics , Solanum lycopersicum/growth & development , Solanum lycopersicum/metabolism , Phenotype , Plant Proteins/genetics
16.
Front Microbiol ; 11: 892, 2020.
Article in English | MEDLINE | ID: mdl-32499766

ABSTRACT

Dietary emulsifiers are widely used in industrially processed foods, although the effects of these food additives on human gut microbiota are not well studied. Here, we investigated the effects of five different emulsifiers [glycerol monoacetate, glycerol monostearate, glycerol monooleate, propylene glycol monostearate, and sodium stearoyl lactylate (SSL)] on fecal microbiota in vitro. We found that 0.025% (w/v) of SSL reduced the relative abundance of the bacterial class Clostridia and others. The relative abundance of the families Clostridiaceae, Lachnospiraceae, and Ruminococcaceae was substantially reduced whereas that of Bacteroidaceae and Enterobacteriaceae was increased. Given the marked impact of SSL on Clostridia, we used genome reconstruction to predict community-wide production of short-chain fatty acids, which were experimentally assessed by GC-MS analysis. SSL significantly reduced concentrations of butyrate, and increased concentrations of propionate compared to control cultures. The presence of SSL increased lipopolysaccharide, LPS and flagellin in cultured communities, thereby enhancing the proinflammatory potential of SSL-selected bacterial communities.

17.
Brief Funct Genomics ; 19(4): 292-308, 2020 07 29.
Article in English | MEDLINE | ID: mdl-32353112

ABSTRACT

The advent of high-resolution chromosome conformation capture assays (such as 5C, Hi-C and Pore-C) has allowed for unprecedented sequence-level investigations into the structure-function relationship of the genome. In order to comprehensively understand this relationship, computational tools are required that utilize data generated from these assays to predict 3D genome organization (the 3D genome reconstruction problem). Many computational tools have been developed that answer this need, but a comprehensive comparison of their underlying algorithmic approaches has not been conducted. This manuscript provides a comprehensive review of the existing computational tools (from November 2006 to September 2019, inclusive) that can be used to predict 3D genome organizations from high-resolution chromosome conformation capture data. Overall, existing tools were found to use a relatively small set of algorithms from one or more of the following categories: dimensionality reduction, graph/network theory, maximum likelihood estimation (MLE) and statistical modeling. Solutions in each category are far from maturity, and the breadth and depth of various algorithmic categories have not been fully explored. While the tools for predicting 3D structure for a genomic region or single chromosome are diverse, there is a general lack of algorithmic diversity among computational tools for predicting the complete 3D genome organization from high-resolution chromosome conformation capture data.


Subject(s)
Chromatin/metabolism , Chromosomes/chemistry , Chromosomes/metabolism , Computational Biology/methods , Genome , Genomics/methods , Molecular Conformation , Algorithms , Animals , Chromatin/chemistry , Chromatin/genetics , Chromosomes/genetics , Humans
18.
BMC Genomics ; 21(Suppl 2): 273, 2020 Apr 16.
Article in English | MEDLINE | ID: mdl-32299356

ABSTRACT

BACKGROUND: Computationally inferred ancestral genomes play an important role in many areas of genome research. We present an improved workflow for the reconstruction from highly diverged genomes such as those of plants. RESULTS: Our work relies on an established workflow in the reconstruction of ancestral plants, but improves several steps of this process. Instead of using gene annotations for inferring the genome content of the ancestral sequence, we identify genomic markers through a process called genome segmentation. This enables us to reconstruct the ancestral genome from hundreds of thousands of markers rather than the tens of thousands of annotated genes. We also introduce the concept of local genome rearrangement, through which we refine syntenic blocks before they are used in the reconstruction of contiguous ancestral regions. With the enhanced workflow at hand, we reconstruct the ancestral genome of eudicots, a major sub-clade of flowering plants, using whole genome sequences of five modern plants. CONCLUSIONS: Our reconstructed genome is highly detailed, yet its layout agrees well with that reported in Badouin et al. (2017). Using local genome rearrangement, not only the marker-based, but also the gene-based reconstruction of the eudicot ancestor exhibited increased genome content, evidencing the power of this novel concept.


Subject(s)
Chromosome Mapping/methods , Genomics/methods , Magnoliopsida/genetics , Computer Simulation , Evolution, Molecular , Gene Order , Genome, Plant , Models, Genetic , Phylogeny , Synteny/genetics
19.
BMC Bioinformatics ; 21(1): 73, 2020 Feb 24.
Article in English | MEDLINE | ID: mdl-32093610

ABSTRACT

BACKGROUND: The spatial configuration of chromosomes is essential to various cellular processes, notably gene regulation, while architecture-related alterations, such as translocations and gene fusions, are often cancer drivers. Thus, eliciting chromatin conformation is important, yet challenging due to compaction, dynamics and scale. However, a variety of recent assays, in particular Hi-C, have generated new details of chromatin structure, spawning a number of novel biological findings. Many findings have resulted from analyses on the level of native contact data as generated by the assays. Alternatively, reconstruction-based approaches often proceed by first converting contact frequencies into distances, then generating a three-dimensional (3D) chromatin configuration that best recapitulates these distances. Subsequent analyses can enrich contact-level analyses via superposition of genomic attributes on the reconstruction. But such advantages depend on the accuracy of the reconstruction which, absent gold standards, is inherently difficult to assess. Attempts at accuracy evaluation have relied on simulation and/or FISH imaging that typically features a handful of low-resolution probes. While newly advanced multiplexed FISH imaging offers possibilities for refined 3D reconstruction accuracy evaluation, availability of such data is limited due to assay complexity, and its resolution is appreciably lower than that of the reconstructions being assessed. Accordingly, there is demand for new methods of reconstruction accuracy appraisal. RESULTS: Here we explore the potential of recently proposed stationary distributions, hereafter StatDns, derived from Hi-C contact matrices, to serve as a basis for reconstruction accuracy assessment. Current usage of such StatDns has focussed on the identification of highly interactive regions (HIRs): computationally defined regions of the genome purportedly involved in numerous long-range intra-chromosomal contacts. Consistent identification of HIRs would be informative with respect to inferred 3D architecture since the corresponding regions of the reconstruction would have an elevated number of k nearest neighbors (kNNs). More generally, we anticipate a monotone decreasing relationship between StatDn values and kNN distances. After initially evaluating the reproducibility of StatDns across replicate Hi-C data sets, we use this implied StatDn-kNN relationship to gauge the utility of StatDns for reconstruction validation, making recourse to both real and simulated examples. CONCLUSIONS: Our analyses demonstrate that, as constructed, StatDns do not provide a suitable measure for assessing the accuracy of 3D genome reconstructions. Whether this is attributable to specific choices surrounding normalization in defining StatDns or to the logic underlying their very formulation remains to be determined.
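
The two quantities compared in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that a StatDn is the stationary distribution of a random walk whose transition probabilities are the row-normalized contact counts; the normalization used in the paper may differ, and the abstract itself notes that normalization choices matter. The function names `stationary_distribution` and `mean_knn_distance` are hypothetical and are not taken from the paper.

```python
import math


def stationary_distribution(contacts, iters=1000, tol=1e-12):
    """Stationary distribution of a random walk on a contact matrix.

    Assumption (not from the paper): transition probabilities are the
    row-normalized contact counts, and the chain is irreducible and
    aperiodic so that power iteration converges.
    """
    n = len(contacts)
    transition = []
    for row in contacts:
        total = sum(row)
        if total <= 0:
            raise ValueError("every locus needs at least one contact")
        transition.append([value / total for value in row])

    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        # One step of pi <- pi * P
        nxt = [sum(pi[i] * transition[i][j] for i in range(n)) for j in range(n)]
        converged = max(abs(a - b) for a, b in zip(nxt, pi)) < tol
        pi = nxt
        if converged:
            break
    return pi


def mean_knn_distance(coords, k):
    """Mean Euclidean distance from each point to its k nearest neighbors."""
    result = []
    for i, point in enumerate(coords):
        dists = sorted(math.dist(point, other)
                       for j, other in enumerate(coords) if j != i)
        result.append(sum(dists[:k]) / k)
    return result


# Toy symmetric contact matrix for three loci (made-up numbers).
toy_contacts = [[2, 1, 1],
                [1, 2, 0],
                [1, 0, 1]]
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]

statdn = stationary_distribution(toy_contacts)
knn = mean_knn_distance(toy_coords, k=1)
```

Under the relationship the authors anticipate, loci with larger stationary values would tend to have smaller kNN distances in an accurate reconstruction; a rank correlation between `statdn` and `knn` would be one way to check this on real data.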


Subject(s)
Chromatin/chemistry , Chromosomes , Genome , Genomics/methods , Molecular Conformation , Reproducibility of Results
20.
Microbiol Res ; 233: 126407, 2020 Mar.
Article in English | MEDLINE | ID: mdl-31945518

ABSTRACT

Lichens have been widely studied for their symbiotic properties and for the secondary metabolites produced by their fungal symbiont. Recent molecular studies have confirmed the coexistence of bacteria with the fungal and algal symbionts. Direct study of nucleic acids by -omics approaches is providing better insights into their structural and functional dynamics. However, genomic analysis of individual members of a lichen is difficult with conventional approaches. Hence, genome assembly from metagenome data needs standardization in eukaryotic systems such as lichens. The present study aimed at metagenomic characterization of the rock-associated lichen Dirinaria collected from the Kutch and Dang regions of Gujarat, followed by genome reconstruction and annotation of the mycobiont Dirinaria. The regions considered in the study are eco-geographically highly variant. The results revealed higher alpha diversity in the lichen from the dry Kutch region than in the tropical-forest-associated lichen from Dang. Ascomycota was the most abundant eukaryote, while Proteobacteria dominated the bacterial population. There were 23 genera observed only in the Kutch lichen (KL) and one genus, Candidatus Vecturithrix, unique to the Dang lichen (DL). The bacterial genera exclusive to Kutch mostly belonged to groups reported for stress tolerance and earlier isolated from lithobionts of extreme niches. The assembled data of KL and DL were further used for genome reconstruction of Dirinaria sp. using GC and tetra-pentamer parameters and reassembly, which resulted in a final draft genome of 31.7 Mb with 9556 predicted genes. Twenty-eight biosynthetic gene clusters were predicted, including genes for polyketide, indole and terpene synthesis. Association analysis of bacteria and mycobiont revealed 8 pathways specific to bacteria with implications for lichen symbiosis and environmental interaction. The study provides the first draft genome for the fungal genus Dirinaria and insights into the Dirinaria lichen metagenome from the Gujarat region.
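
The "GC and tetra-pentamer parameters" mentioned in this abstract are composition features commonly used to separate contigs by source organism. The following is a minimal sketch of how such features might be computed per contig; the abstract does not describe the actual binning tool or settings, so the function names and the toy sequence are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def kmer_frequencies(seq, k=4):
    """Relative frequency of every possible k-mer, in lexicographic order.

    k=4 gives tetranucleotide frequencies (256 values); k=5 gives
    pentanucleotide frequencies (1024 values). Windows containing
    ambiguous bases such as N are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


# Toy contig (hypothetical sequence, not from the study).
contig = "ACGTACGT"
features = [gc_content(contig)] + kmer_frequencies(contig, k=4)
```

Contigs with similar feature vectors can then be grouped by a clustering method of choice; which method the study used is not stated in the abstract.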


Subject(s)
Bacteria/classification , Ecosystem , Fungi/genetics , Lichens/genetics , Metagenome , Ascomycota/genetics , Biosynthetic Pathways , Genomics , Multigene Family , Phylogeny , Proteobacteria/genetics , Sequence Analysis, DNA , Symbiosis/genetics