Your browser doesn't support javascript.
loading
Genome-wide detection of somatic mosaicism at short tandem repeats.
Sehgal, Aarushi; Ziaei Jam, Helyaneh; Shen, Andrew; Gymrek, Melissa.
Afiliación
  • Sehgal A; Department of Computer Science and Engineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, United States.
  • Ziaei Jam H; Department of Computer Science and Engineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, United States.
  • Shen A; Department of Computer Science and Engineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, United States.
  • Gymrek M; Department of Computer Science and Engineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, United States.
Bioinformatics ; 40(8)2024 08 02.
Article en En | MEDLINE | ID: mdl-39078205
ABSTRACT
MOTIVATION Somatic mosaicism has been implicated in several developmental disorders, cancers, and other diseases. Short tandem repeats (STRs) consist of repeated sequences of 1-6 bp and comprise >1 million loci in the human genome. Somatic mosaicism at STRs is known to play a key role in the pathogenicity of loci implicated in repeat expansion disorders and is highly prevalent in cancers exhibiting microsatellite instability. While a variety of tools have been developed to genotype germline variation at STRs, a method for systematically identifying mosaic STRs is lacking.

RESULTS:

We introduce prancSTR, a novel method for detecting mosaic STRs from individual high-throughput sequencing datasets. prancSTR is designed to detect loci characterized by a single high-frequency mosaic allele, but can also detect loci with multiple mosaic alleles. Unlike many existing mosaicism detection methods for other variant types, prancSTR does not require a matched control sample as input. We show that prancSTR accurately identifies mosaic STRs in simulated data, demonstrate its feasibility by identifying candidate mosaic STRs in Illumina whole genome sequencing data derived from lymphoblastoid cell lines for individuals sequenced by the 1000 Genomes Project, and evaluate the use of prancSTR on Element and PacBio data. In addition to prancSTR, we present simTR, a novel simulation framework which simulates raw sequencing reads with realistic error profiles at STRs. AVAILABILITY AND IMPLEMENTATION prancSTR and simTR are freely available at https//github.com/gymrek-lab/trtools. Detailed documentation is available at https//trtools.readthedocs.io/.
Asunto(s)

Texto completo: 1 Colección: 01-internacional Base de datos: MEDLINE Asunto principal: Genoma Humano / Repeticiones de Microsatélite / Secuenciación de Nucleótidos de Alto Rendimiento / Mosaicismo Límite: Humans Idioma: En Revista: Bioinformatics Asunto de la revista: INFORMATICA MEDICA Año: 2024 Tipo del documento: Article País de afiliación: Estados Unidos Pais de publicación: Reino Unido

Texto completo: 1 Colección: 01-internacional Base de datos: MEDLINE Asunto principal: Genoma Humano / Repeticiones de Microsatélite / Secuenciación de Nucleótidos de Alto Rendimiento / Mosaicismo Límite: Humans Idioma: En Revista: Bioinformatics Asunto de la revista: INFORMATICA MEDICA Año: 2024 Tipo del documento: Article País de afiliación: Estados Unidos Pais de publicación: Reino Unido