Results 1 - 17 of 17
1.
Cells ; 12(10)2023 05 10.
Article in English | MEDLINE | ID: mdl-37408191

ABSTRACT

Architectural proteins are essential epigenetic regulators that play a critical role in organizing chromatin and controlling gene expression. CTCF (CCCTC-binding factor) is a key architectural protein responsible for maintaining the intricate 3D structure of chromatin. Because of its multivalent properties and plasticity to bind various sequences, CTCF is similar to a Swiss knife for genome organization. Despite the importance of this protein, its mechanisms of action are not fully elucidated. It has been hypothesized that its versatility is achieved through interaction with multiple partners, forming a complex network that regulates chromatin folding within the nucleus. In this review, we delve into CTCF's interactions with other molecules involved in epigenetic processes, particularly histone and DNA demethylases, as well as several long non-coding RNAs (lncRNAs) that are able to recruit CTCF. Our review highlights the importance of CTCF partners to shed light on chromatin regulation and pave the way for future exploration of the mechanisms that enable the finely-tuned role of CTCF as a master regulator of chromatin.


Subjects
Chromatin; DNA; CCCTC-Binding Factor/genetics; DNA/metabolism; Cell Nucleus/metabolism; Genome
2.
J Exp Bot ; 74(10): 3240-3254, 2023 05 19.
Article in English | MEDLINE | ID: mdl-36880316

ABSTRACT

Natural plant populations are polymorphic and show intraspecific variation in resistance properties against pathogens. The activation of the underlying defence responses can depend on variation in perception of pathogen-associated molecular patterns or elicitors. To dissect such variation, we evaluated the responses induced by laminarin (a glucan, representing an elicitor from oomycetes) in the wild tomato species Solanum chilense and correlated this to observed infection frequencies of Phytophthora infestans. We measured reactive oxygen species burst and levels of diverse phytohormones upon elicitation in 83 plants originating from nine populations. We found high diversity in basal and elicitor-induced levels of each component. Further we generated linear models to explain the observed infection frequency of P. infestans. The effect of individual components differed dependent on the geographical origin of the plants. We found that the resistance in the southern coastal region, but not in the other regions, was directly correlated to ethylene responses and confirmed this positive correlation using ethylene inhibition assays. Our findings reveal high diversity in the strength of defence responses within a species and the involvement of different components with a quantitatively different contribution of individual components to resistance in geographically separated populations of a wild plant species.


Subjects
Phytophthora infestans; Solanum lycopersicum; Solanum tuberosum; Solanum; Ethylenes; Glucans; Phytophthora infestans/physiology; Plant Diseases
3.
NPJ Syst Biol Appl ; 8(1): 5, 2022 02 07.
Article in English | MEDLINE | ID: mdl-35132075

ABSTRACT

High-grade serous ovarian carcinoma (HGSC) is the most lethal gynecologic malignancy due to the lack of reliable biomarkers, effective treatment, and chemoresistance. Improving the diagnosis and the development of targeted therapies is still needed. The molecular pathomechanisms driving HGSC progression are not fully understood though crucial for effective diagnosis and identification of novel targeted therapy options. The oncogene CTCFL (BORIS), the paralog of CTCF, is a transcriptional factor highly expressed in ovarian cancer (but rarely in any other tissue in females) with cancer-specific characteristics and therapeutic potential. In this work, we seek to understand the regulatory functions of CTCFL to unravel new target genes with clinical relevance. We used in vitro models to evaluate the transcriptional changes due to the presence of CTCFL, followed by a selection of gene candidates using de novo network enrichment analysis. The resulting mechanistic candidates were further assessed regarding their prognostic potential and druggability. We show that CTCFL-driven genes are involved in cytoplasmic membrane functions; in particular, the PI3K-Akt initiators EGFR1 and VEGFA, as well as ITGB3 and ITGB6 are potential drug targets. We also identified the CTCFL targets ACTBL2, MALT1 and PCDH7 as mechanistic biomarkers to predict survival in HGSC. Finally, we elucidated the value of CTCFL in combination with its targets as a prognostic marker profile for HGSC progression and as putative drug targets.


Subjects
DNA-Binding Proteins; Ovarian Neoplasms; DNA-Binding Proteins/genetics; Female; Humans; Ovarian Neoplasms/drug therapy; Ovarian Neoplasms/genetics; Ovarian Neoplasms/metabolism; Phosphatidylinositol 3-Kinases/genetics; Proto-Oncogene Proteins c-akt/genetics; Signal Transduction; Transcription Factors
4.
NPJ Syst Biol Appl ; 7(1): 21, 2021 05 24.
Article in English | MEDLINE | ID: mdl-34031419

ABSTRACT

COVID-19 is an infection caused by SARS-CoV-2 (Severe Acute Respiratory Syndrome coronavirus 2), which has caused a global outbreak. Current research efforts are focused on the understanding of the molecular mechanisms involved in SARS-CoV-2 infection in order to propose drug-based therapeutic options. Transcriptional changes due to epigenetic regulation are key host cell responses to viral infection and have been studied in SARS-CoV and MERS-CoV; however, such changes are not fully described for SARS-CoV-2. In this study, we analyzed multiple transcriptomes obtained from cell lines infected with MERS-CoV, SARS-CoV, and SARS-CoV-2, and from COVID-19 patient-derived samples. Using integrative analyses of gene co-expression networks and de novo pathway enrichment, we characterize different gene modules and protein pathways enriched with Transcription Factors or Epifactors relevant for SARS-CoV-2 infection. We identified EP300, MOV10, RELA, and TRIM25 as top candidates, and more than 60 additional proteins involved in the epigenetic response during viral infection that have therapeutic potential. Our results show that targeting the epigenetic machinery could be a feasible alternative to treat COVID-19.


Subjects
COVID-19/genetics; Epigenesis, Genetic/genetics; SARS-CoV-2/genetics; Transcriptome/genetics; COVID-19/virology; Gene Expression Profiling; Humans; Middle East Respiratory Syndrome Coronavirus/genetics; Middle East Respiratory Syndrome Coronavirus/pathogenicity; Severe acute respiratory syndrome-related coronavirus/genetics; Severe acute respiratory syndrome-related coronavirus/pathogenicity; SARS-CoV-2/pathogenicity; Signal Transduction/genetics
5.
Cancers (Basel) ; 12(10)2020 Sep 30.
Article in English | MEDLINE | ID: mdl-33007869

ABSTRACT

Inflammatory breast cancer (IBC) is a rare and aggressive type of breast cancer whose molecular basis is poorly understood. We performed a comprehensive molecular analysis of 24 treatment-naïve IBC biopsies, using a high-resolution microarray platform and targeted next-generation sequencing (105 cancer-related genes). The genes most frequently affected by gains were MYC (75%) and MDM4 (71%), while frequent losses encompassed TP53 (71%) and RB1 (58%). Increased MYC and MDM4 protein expression levels were detected in 18 cases. These genes have been related to IBC aggressiveness, and MDM4 is a potential therapeutic target in IBC. Functional enrichment analysis revealed genes associated with inflammatory regulation and immune response. High homologous recombination (HR) deficiency scores were detected in triple-negative and metastatic IBC cases. A high telomeric allelic imbalance score was found in patients having worse overall survival (OS). The mutational profiling was compared with non-IBC (TCGA, n = 250) and IBC (n = 118) from four datasets, validating our findings. A higher frequency of TP53 and BRCA2 variants was detected compared to non-IBC, while PIK3CA showed a similar frequency. Variants in mismatch repair and HR genes were associated with worse OS. Our study provided a framework for improved diagnosis and therapeutic alternatives for this aggressive tumor type.

6.
Sci Data ; 7(1): 142, 2020 05 11.
Article in English | MEDLINE | ID: mdl-32393779

ABSTRACT

We present the newest version of CoryneRegNet, the reference database for corynebacterial regulatory interactions, available at www.exbio.wzw.tum.de/coryneregnet/. The exponential growth of next-generation sequencing data in recent years has allowed a better understanding of bacterial molecular mechanisms. Transcriptional regulation is one of the most important mechanisms for bacterial adaptation and survival. These mechanisms may be understood via an organism's network of regulatory interactions. Although the Corynebacterium genus is important in medical, veterinary and biotechnological research, little is known concerning the transcriptional regulation of these bacteria. Here, we unravel transcriptional regulatory networks (TRNs) for 224 corynebacterial strains by utilizing genome-scale transfer of TRNs from four model organisms and assigning statistical significance values to all predicted regulations. As a result, the number of corynebacterial strains with TRNs increased twenty times and the back-end and front-end were reimplemented to support new features as well as future database growth. CoryneRegNet 7 is the largest TRN database for the Corynebacterium genus and aids in elucidating transcriptional mechanisms enabling adaptation, survival and infection.


Subjects
Corynebacterium/genetics; Gene Expression Regulation, Bacterial; Gene Regulatory Networks; Databases, Genetic; Datasets as Topic
7.
Oncogenesis ; 8(8): 41, 2019 Aug 12.
Article in English | MEDLINE | ID: mdl-31406110

ABSTRACT

The identification of prognostic biomarkers is a priority for patients suffering from high-grade serous ovarian cancer (SOC), which accounts for >70% of ovarian cancer (OC) deaths. Meanwhile, borderline ovarian cancer (BOC) is a low malignancy tumor and usually patients undergo surgery with low probabilities of recurrence. However, SOC remains the most lethal neoplasm due to the lack of biomarkers for early diagnosis and prognosis. In this regard, BORIS (CTCFL), a CTCF paralog, is a promising cancer biomarker that is overexpressed and controls transcription in several cancer types, mainly in OC. Studies suggest that BORIS has an important function in OC by altering gene expression, but the effect and extent to which BORIS influences transcription in OC from a genome-wide perspective is unclear. Here, we sought to identify BORIS target genes in an OC cell line (OVCAR3) with potential biomarker use in OC tumor samples. To achieve this, we performed in vitro knockout and knockdown experiments of BORIS in OVCAR3 cell line followed by expression microarrays and bioinformatics network enrichment analysis to identify relevant BORIS target genes. In addition, ex vivo expression data analysis of 373 ovarian cancer patients were evaluated to identify the expression patterns of BORIS target genes. In vitro, we uncovered 130 differentially expressed genes and obtained the BORIS-associated regulatory network, in which the androgen receptor (AR) acts as a major transcription factor. Also, FN1, FAM129A, and CD97 genes, which are related to chemoresistance and metastases in OC, were identified. In SOC patients, we observed that malignancy is associated with high levels of BORIS expression while BOC patients show lower levels. Our study suggests that BORIS acts as a main regulator, and has the potential to be used as a prognostic biomarker and to yield novel drug targets among the genes BORIS controls in SOC patients.

8.
Front Oncol ; 9: 395, 2019.
Article in English | MEDLINE | ID: mdl-31192117

ABSTRACT

Pre-operative 5-fluorouracil-based chemoradiotherapy (nCRT) is the standard treatment for patients with locally advanced rectal cancer (LARC). Patients with pathological complete response (pCR; 0% of tumor cells in the surgical specimen after nCRT) have better overall survival and lower risk of recurrence in comparison with incomplete responders (pIR). Predictive biomarkers to be used for new therapeutic strategies and capable of stratifying patients to avoid overtreatment are needed. We evaluated the genomic profiles of 33 pre-treatment LARC biopsies using SNP array and targeted next-generation sequencing (tNGS). Based on the large number of identified genomic alterations, we calculated the genomic instability index (GII) and three homologous recombination deficiency (HRD) scores, which have been reported as impaired DNA repair markers. We observed high GII in our LARC cases, which was confirmed in 165 rectal cancer cases from TCGA. Patients with pCR presented higher GII compared with pIR. Moreover, a negative correlation between GII and the fraction of tumor cells remaining after surgery was observed (ρ = -0.382, P = 0.02). High HRD scores were detected in 61% of LARC, of which 70% were incomplete responders. Using tNGS (105 cancer-related genes, 13 involved in HR and 5 in mismatch repair pathways), we identified 23% of cases with mutations in HR genes, mostly in pIR cases (86% of mutated cases). In agreement, the analysis of the TCGA dataset (N = 145) revealed 21% of tumors with mutations in HR genes. The HRD scores were shown to be predictive of better response to PARP-inhibitors and platinum-based chemotherapy in breast and ovarian cancer. Our results suggest that the same strategy could be applied in a set of LARC patients with HRD. In conclusion, we identified high genomic instability in LARC, which was related to alterations in the HR pathway, especially in pIR. These findings suggest that patients with impaired HR would clinically benefit from PARP-inhibitors and platinum-based therapy.

9.
BMC Cancer ; 19(1): 422, 2019 May 06.
Article in English | MEDLINE | ID: mdl-31060523

ABSTRACT

BACKGROUND: Ovarian carcinomas presenting homologous recombination deficiency (HRD), which is observed in about 50% of cases, are more sensitive to platinum and PARP inhibitor therapies. Although platinum resistant disease has a low chance to be responsive to platinum-based chemotherapy, a set of patients is retreated with platinum and some of them are responsive. In this study, we evaluated copy number alterations, HR gene mutations and HR deficiency scores in ovarian cancer patients with prolonged platinum sensitivity. METHODS: In this retrospective study (2005 to 2014), we selected 31 patients with platinum resistant ovarian cancer retreated with platinum therapy. Copy number alterations and HR scores were evaluated using the OncoScan® FFPE platform in 15 cases. The mutational profile of 24 genes was investigated by targeted-NGS. RESULTS: The median values of the four HRD scores were higher in responders (LOH = 15, LST = 28, tAI = 33, CS = 84) compared with non-responders (LOH = 7.5, LST = 17.5, tAI = 23, CS = 47). Patients with high LOH, LST, tAI and CS scores had better response rates, although these differences were not statistically significant. Response rate to platinum retreatment was 22% in patients with CCNE1 gains and 83.5% in patients with no CCNE1 gains (p = 0.041). Furthermore, response rate was 54.5% in patients with RB1 loss and 25% in patients without RB1 loss (p = 0.569). Patients with CCNE1 gains showed a worse progression free survival (PFS = 11.1 months vs 3.7 months; p = 0.008) and a shorter overall survival (OS = 39.3 months vs 7.1 months; p = 0.007) in comparison with patients with no CCNE1 gains. Patients with RB1 loss had better PFS (9.0 months vs 2.6 months; p = 0.093) and OS (27.4 months vs 3.6 months; p = 0.025) compared with cases with no RB1 loss. Four tumor samples were BRCA mutated and tumor mutations were not associated with response to treatment. 
CONCLUSIONS: HR deficiency was found in 60% of our cases and median HRD values were higher in responders than in non-responders. Despite the small number of patients tested, CCNE1 gain and RB1 loss discriminate patients with tumors extremely sensitive to platinum retreatment.


Subjects
Antineoplastic Combined Chemotherapy Protocols/pharmacology; Cyclin E/genetics; Drug Resistance, Neoplasm/genetics; Oncogene Proteins/genetics; Ovarian Neoplasms/genetics; Platinum Compounds/pharmacology; Retinoblastoma Binding Proteins/genetics; Ubiquitin-Protein Ligases/genetics; Aged; Antineoplastic Combined Chemotherapy Protocols/therapeutic use; Brazil/epidemiology; Child, Preschool; DNA Copy Number Variations/genetics; DNA Mutational Analysis; Female; Homologous Recombination/genetics; Humans; Middle Aged; Ovarian Neoplasms/drug therapy; Ovarian Neoplasms/mortality; Platinum Compounds/therapeutic use; Progression-Free Survival; Retreatment; Retrospective Studies; Survival Analysis
10.
PLoS One ; 12(10): e0186401, 2017.
Article in English | MEDLINE | ID: mdl-29049350

ABSTRACT

Corynebacterium diphtheriae (Cd) is a Gram-positive human pathogen responsible for diphtheria infection and once caused high mortality worldwide. Fatality gradually decreased with improved living standards and declined further when many immunization programs were introduced. However, numerous drug-resistant strains emerged recently that consequently decreased the efficacy of current therapeutics and vaccines, thereby obliging the scientific community to start investigating new therapeutic targets in pathogenic microorganisms. In this study, our contributions include the prediction of the modelome of 13 C. diphtheriae strains, using the MHOLline workflow. A set of 463 conserved proteins was identified by combining the results of pangenomics-based core-genome and core-modelome analyses. Further, using subtractive proteomics and modelomics approaches for target identification, a set of 23 proteins was selected as essential for the bacteria. Considering human as a host, eight of these proteins (glpX, nusB, rpsH, hisE, smpB, bioB, DIP1084, and DIP0983) were considered as essential and non-host homologs, and have been subjected to virtual screening using four different compound libraries (extracted from the ZINC database, plant-derived natural compounds and Di-terpenoid Iso-steviol derivatives). The proposed ligand molecules showed favorable interactions, lowered energy values and high complementarity with the predicted targets. Our proposed approach expedites the selection of C. diphtheriae putative proteins for broad-spectrum development of novel drugs and vaccines, owing to the fact that some of these targets have already been identified and validated in other organisms.


Subjects
Corynebacterium diphtheriae/pathogenicity; Anti-Bacterial Agents/pharmacology; Bacterial Proteins/metabolism; Bacterial Vaccines/pharmacology; Computer Simulation; Corynebacterium diphtheriae/drug effects; Corynebacterium diphtheriae/genetics; Corynebacterium diphtheriae/metabolism; Genome, Bacterial; Humans; Ligands; Models, Biological; Molecular Docking Simulation
11.
BMC Genomics ; 16: 452, 2015 Jun 11.
Article in English | MEDLINE | ID: mdl-26062809

ABSTRACT

BACKGROUND: Organisms utilize a multitude of mechanisms for responding to changing environmental conditions, maintaining their functional homeostasis and overcoming stress situations. One of the most important mechanisms is transcriptional gene regulation. In-depth study of the transcriptional gene regulatory network can lead to various practical applications, creating a greater understanding of how organisms control their cellular behavior. DESCRIPTION: In this work, we present a new database, CMRegNet, for the gene regulatory networks of Corynebacterium glutamicum ATCC 13032 and Mycobacterium tuberculosis H37Rv. We furthermore transferred the known networks of these model organisms to 18 other non-model but phylogenetically close species (target organisms) of the CMNR group. In comparison to other network transfers, for the first time we utilized two model organisms, resulting in a more diverse and complete network of the target organisms. CONCLUSION: CMRegNet provides easy access to a total of 3,103 known regulations in C. glutamicum ATCC 13032 and M. tuberculosis H37Rv and to 38,940 evolutionarily conserved interactions for 18 non-model species of the CMNR group. This makes CMRegNet to date the most comprehensive database of regulatory interactions of CMNR bacteria. The content of CMRegNet is publicly available online via a web interface found at http://lgcm.icb.ufmg.br/cmregnet .


Subjects
Corynebacterium glutamicum/genetics; Databases, Genetic; Gene Regulatory Networks; Mycobacterium tuberculosis/genetics; Computational Biology; Corynebacterium glutamicum/classification; Gene Expression Regulation, Bacterial; Genes, Bacterial; Internet; Mycobacterium tuberculosis/classification; Phylogeny
12.
World J Biol Chem ; 5(2): 161-8, 2014 May 26.
Article in English | MEDLINE | ID: mdl-24921006

ABSTRACT

Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the "scientific value" of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information.

13.
PLoS One ; 7(2): e30848, 2012.
Article in English | MEDLINE | ID: mdl-22355329

ABSTRACT

The adaptability of pathogenic bacteria to hosts is influenced by the genomic plasticity of the bacteria, which can be increased by such mechanisms as horizontal gene transfer. Pathogenicity islands play a major role in this type of gene transfer because they are large, horizontally acquired regions that harbor clusters of virulence genes that mediate the adhesion, colonization, invasion, immune system evasion, and toxigenic properties of the acceptor organism. Currently, pathogenicity islands are mainly identified in silico based on various characteristic features: (1) deviations in codon usage, G+C content or dinucleotide frequency and (2) insertion sequences and/or tRNA genetic flanking regions together with transposase coding genes. Several computational techniques for identifying pathogenicity islands exist. However, most of these techniques are only directed at the detection of horizontally transferred genes and/or the absence of certain genomic regions of the pathogenic bacterium in closely related non-pathogenic species. Here, we present a novel software suite designed for the prediction of pathogenicity islands (pathogenicity island prediction software, or PIPS). In contrast to other existing tools, our approach is capable of utilizing multiple features for pathogenicity island detection in an integrative manner. We show that PIPS provides better accuracy than other available software packages. As an example, we used PIPS to study the veterinary pathogen Corynebacterium pseudotuberculosis, in which we identified seven putative pathogenicity islands.


Subjects
Bacteria/genetics; Bacteria/pathogenicity; Bacterial Infections/pathology; Genomic Islands/genetics; Software; Virulence/genetics; Bacterial Infections/genetics; Bacterial Infections/microbiology; Computational Biology; Genome, Bacterial
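
One of the features this abstract lists, deviation in G+C content, can be illustrated with a sliding-window scan. The sketch below is only an illustration of that single feature and is not the PIPS implementation; the function names, window size, step, and threshold are all assumptions chosen for the example.

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def deviant_windows(genome, window=1000, step=500, threshold=0.05):
    """Return (start, gc) for each window whose G+C content differs from
    the whole-genome mean by more than `threshold` (hypothetical cutoff)."""
    mean_gc = gc_content(genome)
    hits = []
    for start in range(0, len(genome) - window + 1, step):
        gc = gc_content(genome[start:start + window])
        if abs(gc - mean_gc) > threshold:
            hits.append((start, gc))
    return hits


if __name__ == "__main__":
    # Toy genome: an AT-rich background with a GC-rich insert at position 100.
    toy = "AT" * 50 + "GC" * 10 + "AT" * 50
    print(deviant_windows(toy, window=20, step=20, threshold=0.5))
```

A real predictor would combine such a scan with the other signals named in the abstract (codon usage, flanking tRNA genes, transposase genes) rather than rely on G+C deviation alone.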
14.
PLoS One ; 6(4): e18551, 2011 Apr 18.
Article in English | MEDLINE | ID: mdl-21533164

ABSTRACT

BACKGROUND: Corynebacterium pseudotuberculosis, a gram-positive, facultative intracellular pathogen, is the etiologic agent of the disease known as caseous lymphadenitis (CL). CL mainly affects small ruminants, such as goats and sheep; it also causes infections in humans, though rarely. This species is distributed worldwide, but it has the most serious economic impact in Oceania, Africa and South America. Although C. pseudotuberculosis causes major health and productivity problems for livestock, little is known about the molecular basis of its pathogenicity. METHODOLOGY AND FINDINGS: We characterized two C. pseudotuberculosis genomes (Cp1002, isolated from goats; and CpC231, isolated from sheep). Analysis of the predicted genomes showed high similarity in genomic architecture, gene content and genetic order. When C. pseudotuberculosis was compared with other Corynebacterium species, it became evident that this pathogenic species has lost numerous genes, resulting in one of the smallest genomes in the genus. Other differences that could be part of the adaptation to pathogenicity include a lower GC content, of about 52%, and a reduced gene repertoire. The C. pseudotuberculosis genome also includes seven putative pathogenicity islands, which contain several classical virulence factors, including genes for fimbrial subunits, adhesion factors, iron uptake and secreted toxins. Additionally, all of the virulence factors in the islands have characteristics that indicate horizontal transfer. CONCLUSIONS: These particular genome characteristics of C. pseudotuberculosis, as well as its acquired virulence factors in pathogenicity islands, provide evidence of its lifestyle and of the pathogenicity pathways used by this pathogen in the infection process. All genomes cited in this study are available in the NCBI Genbank database (http://www.ncbi.nlm.nih.gov/genbank/) under accession numbers CP001809 and CP001829.


Subjects
Corynebacterium pseudotuberculosis/pathogenicity; Evolution, Molecular; Genome, Bacterial; Virulence/genetics; Corynebacterium pseudotuberculosis/genetics
15.
J Microbiol Methods ; 86(2): 218-23, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21620904

ABSTRACT

Due to the advent of the so-called Next-Generation Sequencing (NGS) technologies, the amount of monetary and temporal resources for whole-genome sequencing has been reduced by several orders of magnitude. Sequence reads can be assembled either by anchoring them directly onto an available reference genome (classical reference assembly), or can be concatenated by overlap (de novo assembly). The latter strategy is preferable because it tends to maintain the architecture of the genome sequence; however, depending on the NGS platform used, the shortness of read lengths causes tremendous problems in the subsequent genome assembly phase, impeding closing of the entire genome sequence. To address the problem, we developed a multi-pronged hybrid de novo strategy combining De Bruijn graph and Overlap-Layout-Consensus methods, which was used to assemble from short reads the entire genome of Corynebacterium pseudotuberculosis strain I19, a bacterium with immense importance in veterinary medicine that causes Caseous Lymphadenitis in ruminants, principally ovines and caprines. Briefly, contigs were assembled de novo from the short reads and were only oriented using a reference genome by anchoring. Remaining gaps were closed using iterative anchoring of short reads by craning to gap flanks. Finally, we compare the genome sequence assembled using our hybrid strategy to a classical reference assembly using the same data as input and show that with the availability of a reference genome, it pays off to use the hybrid de novo strategy, rather than a classical reference assembly, because more genome sequences are preserved using the former.


Subjects
Computational Biology/methods; Corynebacterium pseudotuberculosis/genetics; DNA, Bacterial/genetics; Genome, Bacterial; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA/methods; DNA, Bacterial/chemistry
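
The De Bruijn graph mentioned in this abstract can be sketched in a few lines. This is a generic textbook construction, not the authors' hybrid pipeline: each k-mer in a read becomes an edge from its (k-1)-length prefix to its (k-1)-length suffix. The function name and the toy reads are assumptions made for illustration.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Build a De Bruijn graph: nodes are (k-1)-mers, and every k-mer
    found in a read adds an edge from its prefix to its suffix."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


if __name__ == "__main__":
    # Two overlapping toy reads; the shared k-mer CGT links them in the graph.
    for node, successors in de_bruijn(["ACGT", "CGTA"], k=3).items():
        print(node, "->", sorted(successors))
```

Walking unbranched paths through such a graph yields contigs; an Overlap-Layout-Consensus assembler instead compares whole reads pairwise, which is why the two methods are complementary.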
16.
BMC Res Notes ; 4: 130, 2011 Apr 18.
Article in English | MEDLINE | ID: mdl-21501521

ABSTRACT

BACKGROUND: Second generation technologies have advantages over Sanger; however, they have resulted in new challenges for the genome construction process, especially because of the small size of the reads, despite the high degree of coverage. Independent of the program chosen for the construction process, DNA sequences are superimposed, based on identity, to extend the reads, generating contigs; mismatches indicate a lack of homology and are not included. This process improves our confidence in the sequences that are generated. FINDINGS: We developed Quality Assessment Software, with which one can review graphs showing the distribution of quality values from the sequencing reads. This software allows us to adopt more stringent quality standards for sequence data, based on quality-graph analysis and estimated coverage after applying the quality filter, providing acceptable sequence coverage for genome construction from short reads. CONCLUSIONS: Quality filtering is a fundamental step in the process of constructing genomes, as it reduces the frequency of incorrect alignments that are caused by measuring errors, which can occur during the construction process due to the size of the reads, provoking misassemblies. Application of quality filters to sequence data, using the software Quality Assessment, along with graphing analyses, provided greater precision in the definition of cutoff parameters, which increased the accuracy of genome construction.
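
The quality-filtering step described above can be sketched as a mean-quality cutoff on each read. This is a minimal illustration, not the Quality Assessment software itself: the Phred+33 ASCII encoding, the cutoff of 20, and all function names are assumptions, and the record's own platform may encode qualities differently.

```python
def phred_scores(quality, offset=33):
    """Decode an ASCII quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def passes_filter(quality, min_mean=20, offset=33):
    """True if the read's mean Phred score reaches `min_mean` (hypothetical cutoff)."""
    scores = phred_scores(quality, offset)
    return bool(scores) and sum(scores) / len(scores) >= min_mean


def filter_reads(reads, min_mean=20):
    """Keep only the (sequence, quality) pairs that pass the mean-quality filter."""
    return [(seq, qual) for seq, qual in reads if passes_filter(qual, min_mean)]


if __name__ == "__main__":
    reads = [("ACGT", "IIII"), ("ACGT", "####")]  # 'I' is Q40, '#' is Q2
    kept = filter_reads(reads)
    print(len(kept), "of", len(reads), "reads kept")
```

Counting the bases that survive the filter and dividing by the expected genome size gives the estimated post-filter coverage, which is the quantity the abstract says guided the choice of cutoff.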

17.
BMC Genomics ; 12 Suppl 4: S11, 2011 Dec 22.
Article in English | MEDLINE | ID: mdl-22369633

ABSTRACT

BACKGROUND: Singular value decomposition (SVD) is a powerful technique for information retrieval; it helps uncover relationships between elements that are not prima facie related. SVD was initially developed to reduce the time needed for information retrieval and analysis of very large data sets in the complex internet environment. Since information retrieval from large-scale genome and proteome data sets has a similar level of complexity, SVD-based methods could also facilitate data analysis in this research area. RESULTS: We found that SVD applied to amino acid sequences demonstrates relationships and provides a basis for producing clusters and cladograms, demonstrating evolutionary relatedness of species that correlates well with Linnaean taxonomy. The choice of a reasonable number of singular values is crucial for SVD-based studies. We found that fewer singular values are needed to produce biologically significant clusters when SVD is employed. Subsequently, we developed a method to determine the lowest number of singular values and fewest clusters needed to guarantee biological significance; this system was developed and validated by comparison with Linnaean taxonomic classification. CONCLUSIONS: By using SVD, we can reduce uncertainty concerning the appropriate rank value necessary to perform accurate information retrieval analyses. In tests, clusters that we developed with SVD perfectly matched what was expected based on Linnaean taxonomy.


Subjects
Algorithms; Cluster Analysis; Information Storage and Retrieval; Software
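
The idea of applying SVD to amino acid sequences can be sketched as follows. This is an assumption-laden illustration rather than the method of the record above: sequences are represented here by dipeptide counts (the abstract does not state the feature used), NumPy is assumed to be available, and the rank `k` is picked by hand instead of by the selection procedure the authors describe.

```python
from itertools import product

import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Map each of the 400 possible dipeptides to a column index.
DIPEPTIDE_INDEX = {a + b: i for i, (a, b) in enumerate(product(AMINO_ACIDS, repeat=2))}


def dipeptide_counts(seq):
    """Count overlapping dipeptides in a protein sequence (assumed feature choice)."""
    vec = np.zeros(len(DIPEPTIDE_INDEX))
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in DIPEPTIDE_INDEX:
            vec[DIPEPTIDE_INDEX[pair]] += 1
    return vec


def reduce_rank(seqs, k):
    """Project sequences onto the top-k singular directions of the count matrix."""
    matrix = np.array([dipeptide_counts(s) for s in seqs])
    u, s, _vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]


if __name__ == "__main__":
    coords = reduce_rank(["ACDEFG", "ACDEFG", "WYWYWY"], k=2)
    print(coords.shape)  # one row of k coordinates per sequence
```

The reduced coordinates can then be passed to any clustering routine; identical sequences map to identical points, and the number of singular values kept controls how much detail the clusters can resolve.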