BugMat and FindNeighbour: command line and server applications for investigating bacterial relatedness.
Mazariegos-Canellas, Oriol; Do, Trien; Peto, Tim; Eyre, David W; Underwood, Anthony; Crook, Derrick; Wyllie, David H.
Affiliation
  • Mazariegos-Canellas O; Nuffield Department of Medicine, John Radcliffe Hospital, Headley Way, Oxford, OX3 9DU, UK.
  • Do T; Nuffield Department of Medicine, John Radcliffe Hospital, Headley Way, Oxford, OX3 9DU, UK.
  • Peto T; Nuffield Department of Medicine, John Radcliffe Hospital, Headley Way, Oxford, OX3 9DU, UK.
  • Eyre DW; Nuffield Department of Medicine, John Radcliffe Hospital, Headley Way, Oxford, OX3 9DU, UK.
  • Underwood A; Public Health England, 61 Colindale Avenue, London, NW9 5EQ, UK.
  • Crook D; Nuffield Department of Medicine, John Radcliffe Hospital, Headley Way, Oxford, OX3 9DU, UK.
  • Wyllie DH; Nuffield Department of Medicine, John Radcliffe Hospital, Headley Way, Oxford, OX3 9DU, UK. David.wyllie@ndm.ox.ac.uk.
BMC Bioinformatics ; 18(1): 477, 2017 Nov 13.
Article in En | MEDLINE | ID: mdl-29132318
BACKGROUND: Large-scale bacterial sequencing has made determining the genetic relationships within large collections of bacterial genomes from the same species an increasingly common task. Solutions to this problem have applications in public health (for example, detecting possible disease transmission) and in divide-and-conquer strategies that select groups of similar isolates for computationally intensive phylogenetic inference, such as maximum likelihood methods. However, generating and maintaining distance matrices is itself computationally intensive, and rapid methods are needed to translate microbial genomics into public health action.
RESULTS: We developed, tested and deployed three solutions. BugMat is a fast C++ application that generates one-off, in-memory distance matrices. FindNeighbour and FindNeighbour2 are server-side applications that build, maintain, and persist distance matrices for a set of sequences: complete matrices in the case of FindNeighbour, sparse matrices in the case of FindNeighbour2. FindNeighbour and BugMat use a variation model to accelerate computation, while FindNeighbour2 uses reference-based compression. Performance metrics show scalability into tens of thousands of sequences, with options for scaling further.
CONCLUSION: Three applications, each with distinct strengths and weaknesses, are available for distance-matrix-based analysis of large bacterial collections. Deployed as part of the Public Health England solution for M. tuberculosis genomic processing, they will have wide applicability.
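The general idea of combining reference-based compression with a sparse distance matrix can be illustrated with a minimal Python sketch. This is not the FindNeighbour2 implementation, whose details the abstract does not give; the function names (`compress`, `snp_distance`, `sparse_matrix`), the toy sequences, and the `cutoff` parameter are all assumptions made for illustration. The sketch assumes equal-length sequences already aligned to a common reference and ignores ambiguous or uncalled bases.

```python
from itertools import combinations


def compress(seq, ref):
    """Keep only the positions where seq differs from ref, as {position: base}."""
    if len(seq) != len(ref):
        raise ValueError("sequence and reference must have equal length")
    return {i: b for i, (b, r) in enumerate(zip(seq, ref)) if b != r}


def snp_distance(a, b):
    """Count positions at which two compressed sequences differ.

    A position stored in neither dict matches the reference in both
    sequences, so only the union of stored positions is examined.
    """
    return sum(1 for pos in a.keys() | b.keys() if a.get(pos) != b.get(pos))


def sparse_matrix(samples, ref, cutoff):
    """Return {(name1, name2): distance} for pairs with distance <= cutoff."""
    compressed = {name: compress(seq, ref) for name, seq in samples.items()}
    links = {}
    for x, y in combinations(sorted(compressed), 2):
        d = snp_distance(compressed[x], compressed[y])
        if d <= cutoff:  # pairs further apart are simply not stored
            links[(x, y)] = d
    return links


# Hypothetical toy data: a 10-base reference and four aligned samples.
REF = "ACGTACGTAC"
SAMPLES = {
    "s1": "ACGTACGTAC",
    "s2": "ACGTTCGTAC",
    "s3": "TCGTTCGTAA",
    "s4": "GGGTACGTCC",
}

if __name__ == "__main__":
    for pair, dist in sparse_matrix(SAMPLES, REF, cutoff=2).items():
        print(pair, dist)
```

Storing only the pairs within the cutoff keeps memory proportional to the number of close relationships rather than to the square of the collection size, which is the usual motivation for a sparse matrix when only near neighbours are of interest.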

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Phylogeny / Bacteria / Software / Genome, Bacterial / Genomics Language: En Journal: BMC Bioinformatics Journal subject: MEDICAL INFORMATICS Year: 2017 Document type: Article Country of publication: United Kingdom