
Graduate student: Hsu, Shu-Ni (徐淑妮)
Thesis title: Comparison of Algorithms for Copy Number Variation Detection
Chinese title: 偵測基因重複數方法之比較
Advisor: Hsieh, Wen-Ping (謝文萍)
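Degree: Master
Department: College of Science, Institute of Statistics
Year of publication: 2010
Academic year of graduation: 98 (ROC calendar; 2009-2010)
Language: English
Number of pages: 42
Chinese keywords: SNP 晶片 (SNP array), 染色體片段套數變異 (chromosomal segment copy number variation)
English keywords: SNP array, copy number variation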
  • Abstract

    Copy number variation (CNV) is one of the common types of structural variation in the human genome. Many technologies have been developed to detect copy number variations, and one of them is the SNP array. With advances in chip technology, genome-wide information can be obtained quickly. A number of algorithms have been proposed to efficiently extract chromosomal copy number variation information from the large volume of SNP data. The goal of this study is to assess how these CNV detection algorithms perform. In this thesis, we compare four algorithms on Affymetrix SNP 6.0 array data from the 270 HapMap samples. Performance is evaluated based on the consistency of calls within trios and the concordance rate with two published data sets. According to our analysis, no single CNV detection algorithm outperforms all of the other CNV calling methods.


    1. Introduction
      1.1 General Background Information
      1.2 Literature Review
      1.3 Research Purpose
      1.4 Overview
    2. Method
      2.1 Affymetrix SNP array data format
      2.2 LRR and BAF
      2.3 Hidden Markov Model (HMM)
      2.4 QuantiSNP
        2.4.1 Transition probability in the HMM
        2.4.2 Emission probability in the HMM
        2.4.3 Hierarchical priors used in model parameters
        2.4.4 Model parameter estimation and the most likely path tracing
      2.5 PennCNV
        2.5.1 Transition probability in the HMM
        2.5.2 Emission probability in the HMM
        2.5.3 Model parameter estimation and the most likely path tracing
      2.6 GenoCNV
        2.6.1 Transition probability in the HMM
        2.6.2 Emission probability in the HMM
        2.6.3 Model parameter estimation and the most likely path tracing
      2.7 COKGEN
        2.7.1 Normalization step
        2.7.2 Raw copy number extraction step
        2.7.3 The objective function
        2.7.4 Low-pass filter and edge detection procedure to determine the initial candidate CNV regions
        2.7.5 Gradually improve the CNV boundaries with simulated annealing
      2.8 Overlapping rate
      2.9 Data set and the software
    3. Results
      3.1 Summary statistics of the CNV calls
      3.2 Consistency between parents and the child
      3.3 Comparing the CNV calls with the detection from sequencing data
      3.4 Comparing the CNV calls detected by the four methods with the CNV events published by McCarroll (2008)
      3.5 Comparing the CNV calls between algorithms
      3.6 Consistency between the CNV regions published by McCarroll (2008) and Kidd (2008)
    4. Discussion

    BIOCONDUCTOR, http://www.bioconductor.org/
    COKGEN, http://mendel.gene.cwru.edu/laframboiselab/software.php
    Colella, S., C. Yau, et al. (2007). "QuantiSNP: an Objective Bayes Hidden-Markov Model to detect and accurately map copy number variation using SNP genotyping data." Nucleic Acids Res 35(6): 2013-2025.
    Fellermann, K., D. E. Stange, et al. (2006). "A chromosome 8 gene-cluster polymorphism with low human beta-defensin 2 gene copy number predisposes to Crohn disease of the colon." Am J Hum Genet 79(3): 439-448.
    Feuk, L., A. R. Carson, et al. (2006). "Structural variation in the human genome." Nat Rev Genet 7(2): 85-97.
    GenoCN, http://www.bios.unc.edu/~wsun/software/genoCN.htm
    Huang, J., W. Wei, et al. (2006). "CARAT: a novel method for allelic detection of DNA copy number changes using high density oligonucleotide arrays." BMC Bioinformatics 7: 83.
    Kidd, J. M., G. M. Cooper, et al. (2008). "Mapping and sequencing of structural variation from eight human genomes." Nature 453(7191): 56-64.
    Korn, J. M., F. G. Kuruvilla, et al. (2008). "Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and rare CNVs." Nat Genet 40(10): 1253-1260.
    Laframboise, T., D. Harrington, et al. (2007). "PLASQ: a generalized linear model-based procedure to determine allelic dosage in cancer cells from SNP array data." Biostatistics 8(2): 323-336.
    McCarroll, S. A., F. G. Kuruvilla, et al. (2008). "Integrated detection and population-genetic analysis of SNPs and copy number variation." Nat Genet 40(10): 1166-1174.
    Olshen, A. B., E. S. Venkatraman, et al. (2004). "Circular binary segmentation for the analysis of array-based DNA copy number data." Biostatistics 5(4): 557-572.
    PennCNV, http://www.openbioinformatics.org/penncnv/penncnv_download.html
    PennCNV-Affy, http://www.openbioinformatics.org/penncnv/penncnv_tutorial_affy_gw6.html
    Pique-Regi, R., J. Monso-Varona, et al. (2008). "Sparse representation and Bayesian detection of genome copy number alterations from microarray data." Bioinformatics 24(3): 309-318.
    QuantiSNP, http://groups.google.co.uk/group/quantisnp/web/software-updates
    Rovelet-Lecrux, A., D. Hannequin, et al. (2006). "APP locus duplication causes autosomal dominant early-onset Alzheimer disease with cerebral amyloid angiopathy." Nat Genet 38(1): 24-26.
    Sebat, J., B. Lakshmi, et al. (2007). "Strong association of de novo copy number mutations with autism." Science 316(5823): 445-449.
    Simon-Sanchez, J., S. Scholz, et al. (2008). "Genomewide SNP assay reveals mutations underlying Parkinson disease." Hum Mutat 29(2): 315-322.
    Sun, W., F. A. Wright, et al. (2009). "Integrated study of copy number states and genotype calls using high-density SNP arrays." Nucleic Acids Res 37(16): 5365-5377.
    Walsh, T., J. M. McClellan, et al. (2008). "Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia." Science 320(5875): 539-543.
    Wang, K., M. Li, et al. (2007). "PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data." Genome Res 17(11): 1665-1674.
    Winchester, L., C. Yau, et al. (2009). "Comparing CNV detection methods for SNP arrays." Brief Funct Genomic Proteomic 8(5): 353-366.
    Yavas, G., M. Koyuturk, et al. (2009). "An optimization framework for unsupervised identification of rare copy number variation from SNP array data." Genome Biol 10(10): R119.

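    Full-text availability (on-campus network): full text not authorized for public release
    Full-text availability (off-campus network): full text not authorized for public release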
