A random forest-based framework for genotyping and accuracy assessment of copy number variations.
Zhuang, Xuehan; Ye, Rui; So, Man-Ting; Lam, Wai-Yee; Karim, Anwarul; Yu, Michelle; Ngo, Ngoc Diem; Cherny, Stacey S; Tam, Paul Kwong-Hang; Garcia-Barcelo, Maria-Mercè; Tang, Clara Sze-Man; Sham, Pak Chung.
Affiliation
  • Zhuang X; Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.
  • Ye R; Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.
  • So MT; Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.
  • Lam WY; Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.
  • Karim A; Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.
  • Yu M; Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.
  • Ngo ND; National Hospital of Pediatrics, Ha Noi 100000, Vietnam.
  • Cherny SS; Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.
  • Tam PK; Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.
  • Garcia-Barcelo MM; Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.
  • Tang CS; Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.
  • Sham PC; Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.
NAR Genom Bioinform ; 2(3): lqaa071, 2020 Sep.
Article in English | MEDLINE | ID: mdl-33575619
Detection of copy number variations (CNVs) is essential for uncovering genetic factors underlying human diseases. However, CNV detection by current methods is prone to error, and precisely identifying CNVs from paired-end whole genome sequencing (WGS) data is still challenging. Here, we present a framework, CNV-JACG, for Judging the Accuracy of CNVs and Genotyping using paired-end WGS data. CNV-JACG is based on a random forest model trained on 21 distinctive features characterizing the CNV region and its breakpoints. Using the data from the 1000 Genomes Project, Genome in a Bottle Consortium, the Human Genome Structural Variation Consortium and in-house technical replicates, we show that CNV-JACG has superior sensitivity over the latest genotyping method, SV2, particularly for the small CNVs (≤1 kb). We also demonstrate that CNV-JACG outperforms SV2 in terms of Mendelian inconsistency in trios and concordance between technical replicates. Our study suggests that CNV-JACG would be a useful tool in assessing the accuracy of CNVs to meet the ever-growing needs for uncovering the missing heritability linked to CNVs.
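The Mendelian-inconsistency criterion mentioned in the abstract can be illustrated with a minimal sketch. This is not CNV-JACG's code. It assumes a biallelic autosomal CNV, with each genotype encoded as the number of CNV-carrying alleles (0, 1 or 2). The function names are hypothetical. Under this encoding a true de novo event is counted as an inconsistency, just as a genotyping error would be.

```python
from itertools import product


def transmissible(alt_count):
    """Alleles (0 = reference, 1 = CNV) that a parent with this genotype can pass on."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[alt_count]


def is_mendelian_consistent(father, mother, child):
    """True if the child's genotype can be formed from one allele of each parent."""
    return any(
        a + b == child
        for a, b in product(transmissible(father), transmissible(mother))
    )


def mendelian_inconsistency_rate(trios):
    """Fraction of (father, mother, child) genotype triples that violate Mendelian inheritance."""
    trios = list(trios)
    if not trios:
        return 0.0
    violations = sum(not is_mendelian_consistent(f, m, c) for f, m, c in trios)
    return violations / len(trios)


# Hypothetical genotype calls for one CNV site across five trios.
example_trios = [(0, 0, 0), (0, 0, 1), (1, 1, 2), (2, 0, 1), (2, 2, 0)]
rate = mendelian_inconsistency_rate(example_trios)
```

In this toy input, the second and fifth trios are impossible under Mendelian inheritance. A lower rate across many sites suggests more reliable genotypes, which is the sense in which the abstract compares the two genotyping methods.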

Full text: 1 Collection: 01-international Database: MEDLINE Study type: Clinical_trials / Prognostic_studies Language: En Journal: NAR Genom Bioinform Year: 2020 Document type: Article Country of affiliation: China Country of publication: United Kingdom
