Búsqueda | Portal Regional de la BVS

SVDF: enhancing structural variation detect from long-read sequencing via automatic filtering strategies.

Hu, Heng; Gao, Runtian; Gao, Wentao; Gao, Bo; Jiang, Zhongjun; Zhou, Murong; Wang, Guohua; Jiang, Tao.

Brief Bioinform ; 25(4)2024 May 23.

Artículo en Inglés | MEDLINE | ID: mdl-38980375

RESUMEN

Structural variation (SV) is an important form of genomic variation that influences gene function and expression by altering the structure of the genome. Although long-read data have been proven to better characterize SVs, SVs detected from noisy long-read data still include a considerable portion of false-positive calls. To accurately detect SVs in long-read data, we present SVDF, a method that employs a learning-based noise filtering strategy and an SV signature-adaptive clustering algorithm, for effectively reducing the likelihood of false-positive events. Benchmarking results from multiple orthogonal experiments demonstrate that, across different sequencing platforms and depths, SVDF achieves higher calling accuracy for each sample compared to several existing general SV calling tools. We believe that, with its meticulous and sensitive SV detection capability, SVDF can bring new opportunities and advancements to cutting-edge genomic research.

Asunto(s)

Algoritmos , Humanos , Análisis de Secuencia de ADN/métodos , Secuenciación de Nucleótidos de Alto Rendimiento/métodos , Genómica/métodos , Variación Estructural del Genoma , Programas Informáticos

ChimeraMiner: An Improved Chimeric Read Detection Pipeline and Its Application in Single Cell Sequencing.

Lu, Na; Li, Junji; Bi, Changwei; Guo, Jing; Tao, Yuhan; Luan, Kaihao; Tu, Jing; Lu, Zuhong.

Int J Mol Sci ; 20(8)2019 Apr 21.

Artículo en Inglés | MEDLINE | ID: mdl-31010074

RESUMEN

As the most widely-used single cell whole genome amplification (WGA) approach, multiple displacement amplification (MDA) has a superior performance, due to the high-fidelity and processivity of phi29 DNA polymerase. However, chimeric reads, generated in MDA, cause severe disruption in many single-cell studies. Herein, we constructed ChimeraMiner, an improved chimeric read detection pipeline for analyzing the sequencing data of MDA and classified the chimeric sequences. Two datasets (MDA1 and MDA2) were used for evaluating and comparing the efficiency of ChimeraMiner and previous pipeline. Under the same hardware condition, ChimeraMiner spent only 43.4% (43.8% for MDA1 and 43.0% for MDA2) processing time. Respectively, 24.4 million (6.31%) read pairs out of 773 million reads, and 17.5 million (6.62%) read pairs out of 528 million reads were accurately classified as chimeras by ChimeraMiner. In addition to finding 83.60% (17,639,371) chimeras, which were detected by previous pipelines, ChimeraMiner screened 6,736,168 novel chimeras, most of which were missed by the previous pipeline. Applying in single-cell datasets, all three types of chimera were discovered in each dataset, which introduced plenty of false positives in structural variation (SV) detection. The identification and filtration of chimeras by ChimeraMiner removed most of the false positive SVs (83.8%). ChimeraMiner revealed improved efficiency in discovering chimeric reads, and is promising to be widely used in single-cell sequencing.

Asunto(s)

Quimera/genética , Biología Computacional/métodos , Programas Informáticos , Genoma Humano , Humanos , Análisis de la Célula Individual , Secuenciación Completa del Genoma

Genome-Wide Structural Variation Detection by Genome Mapping on Nanochannel Arrays.

Mak, Angel C Y; Lai, Yvonne Y Y; Lam, Ernest T; Kwok, Tsz-Piu; Leung, Alden K Y; Poon, Annie; Mostovoy, Yulia; Hastie, Alex R; Stedman, William; Anantharaman, Thomas; Andrews, Warren; Zhou, Xiang; Pang, Andy W C; Dai, Heng; Chu, Catherine; Lin, Chin; Wu, Jacob J K; Li, Catherine M L; Li, Jing-Woei; Yim, Aldrin K Y; Chan, Saki; Sibert, Justin; Dzakula, Zeljko; Cao, Han; Yiu, Siu-Ming; Chan, Ting-Fung; Yip, Kevin Y; Xiao, Ming; Kwok, Pui-Yan.

Genetics ; 202(1): 351-62, 2016 Jan.

Artículo en Inglés | MEDLINE | ID: mdl-26510793

RESUMEN

Comprehensive whole-genome structural variation detection is challenging with current approaches. With diploid cells as DNA source and the presence of numerous repetitive elements, short-read DNA sequencing cannot be used to detect structural variation efficiently. In this report, we show that genome mapping with long, fluorescently labeled DNA molecules imaged on nanochannel arrays can be used for whole-genome structural variation detection without sequencing. While whole-genome haplotyping is not achieved, local phasing (across >150-kb regions) is routine, as molecules from the parental chromosomes are examined separately. In one experiment, we generated genome maps from a trio from the 1000 Genomes Project, compared the maps against that derived from the reference human genome, and identified structural variations that are >5 kb in size. We find that these individuals have many more structural variants than those published, including some with the potential of disrupting gene function or regulation.

Asunto(s)

Mapeo Cromosómico , Variación Estructural del Genoma , Análisis por Micromatrices/métodos , Línea Celular , Genoma Humano , Humanos

Making the difference: integrating structural variation detection tools.

Lin, Ke; Smit, Sandra; Bonnema, Guusje; Sanchez-Perez, Gabino; de Ridder, Dick.

Brief Bioinform ; 16(5): 852-64, 2015 Sep.

Artículo en Inglés | MEDLINE | ID: mdl-25504367

RESUMEN

From prokaryotes to eukaryotes, phenotypic variation, adaptation and speciation has been associated with structural variation between genomes of individuals within the same species. Many computer algorithms detecting such variations (callers) have recently been developed, spurred by the advent of the next-generation sequencing technology. Such callers mainly exploit split-read mapping or paired-end read mapping. However, as different callers are geared towards different types of structural variation, there is still no single caller that can be considered a community standard; instead, increasingly the various callers are combined in integrated pipelines. In this article, we review a wide range of callers, discuss challenges in the integration step and present a survey of pipelines used in population genomics studies. Based on our findings, we provide general recommendations on how to set-up such pipelines. Finally, we present an outlook on future challenges in structural variation detection.

Asunto(s)

Secuenciación de Nucleótidos de Alto Rendimiento , Interpretación Estadística de Datos , Genética de Población , Genoma

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA