ABSTRACT
Serratia marcescens is a Gram-negative bacterium found in several environmental niches, including the plant rhizosphere and hospital patients. Here, we present the genome of Serratia marcescens strain N4-5 (=NRRL B-65519), which has a size of 5,074,473 bp (664-fold coverage), contains 4,840 protein-coding genes and 21 RNA genes, and has an average G + C content of 59.7%. N4-5 harbours a plasmid of 11,089 bp and 43.5% G + C content that encodes six unique CDS repeated 2.5×, totalling 13 CDS. Our genome assembly and manual curation uncovered the insertion of two extra copies of the 5S rRNA gene in the assembled sequence, which PCR and Sanger sequencing confirmed to be a misassembly. This artefact was subsequently removed from the final assembly. In our comparative analysis, extra copies of the 5S rRNA gene were also observed in most complete genomes of Serratia spp. deposited in public databases. These elements, which also occur naturally, can easily be confused with true genetic variation. Efforts should be made to discover and correct assembly artefacts so that genome sequences represent the biological truth of the studied organism. We present the genome of N4-5 and discuss genes potentially involved in biological control activity against plant pathogens, as well as the possible mechanisms responsible for the artefact observed in our initial assembly. This report raises awareness that extra copies of the 5S rRNA gene in sequenced bacterial genomes may represent misassemblies and should therefore be verified experimentally.
Subject(s)
Bacterial Genome , Serratia marcescens/classification , Serratia marcescens/genetics , Whole Genome Sequencing , Base Composition , Biological Control Agents , Phylogeny , 16S Ribosomal RNA/genetics , DNA Sequence Analysis
ABSTRACT
In this study, the full genome sequence of Bacillus velezensis strain UFLA258, a biological control agent of plant pathogens, was obtained, assembled, and annotated. Using a comparative genomics approach, in silico analyses were performed on all complete genomes of B. velezensis and closely related species available in the database. The genome of B. velezensis UFLA258 consisted of a single circular chromosome of 3.95 Mb in length, with a mean GC content of 46.69%. It contained 3,949 protein-coding genes and 27 RNA genes. Analyses based on Average Nucleotide Identity and Digital DNA-DNA Hybridization, together with a phylogeny built from complete sequences of the rpoB gene, confirmed that 19 strains deposited in the database as Bacillus amyloliquefaciens were in fact B. velezensis. In total, 115 genomes were analyzed and taxonomically classified as follows: 105 were B. velezensis, 9 were B. amyloliquefaciens, and 1 was Bacillus siamensis. Although these species are phylogenetically close, the combined analysis of several genomic characteristics, such as the presence of biosynthetic genes encoding secondary metabolites, CRISPR/Cas arrays, Average Nucleotide Identity, and Digital DNA-DNA Hybridization, along with other information on the strains, including isolation source, allowed their unequivocal classification. This genomic analysis expands our knowledge of the closely related species B. velezensis, B. amyloliquefaciens, and B. siamensis, with emphasis on their taxonomic status.