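Author: Wu, Chun Ying (吳俊穎)
Thesis title: Normalized Mutual Information and Tree-growing Methods for Haplotype Inference (利用標準化互信息與樹成長方法預測單倍體)
Advisor: Soo, Von Wun (蘇豐文)
Oral defense committee: Chen, Yi Shin (陳宜欣)
Chen, Hwann Tzong (陳煥宗)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications
Year of publication: 2015
Academic year of graduation: 103 (ROC calendar)
Language: Chinese
Number of pages: 32
Keywords (Chinese): 單倍體預測 (haplotype prediction)
Keywords (foreign language): haplotype inference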
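  • Since the Human Genome Project revealed the human gene sequence, haplotypes have become valuable information for research on genetic diseases and on drug target prediction. Obtaining haplotype information through clinical experiments, however, is very time-consuming and economically demanding, so in recent years a growing number of studies have sought to infer haplotypes computationally in order to obtain them faster. In this thesis, we address the problem of inferring correct haplotypes by introducing information about the correlation among multiple single-nucleotide polymorphisms (SNPs) into the tree-growing approach. With this method, inference accuracy improves by up to 2.62% over the parsimonious tree-grow method, and the highest inference accuracy reaches 95.2%. With highly accurate haplotype inference, we hope that the step of obtaining haplotype information through clinical experiments can be skipped and that the inferred haplotypes can be used directly to support other related research.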


    Haplotypes are powerful information for candidate-gene studies because of their inheritance characteristics. However, obtaining haplotype information with in vitro methods costs considerable time and money, so it is helpful to infer haplotypes with in silico methods. Because haplotype inference is an NP-hard problem, both accuracy and computation time are important issues. In this thesis, we incorporate normalized mutual information into the parsimonious tree-grow method, which shows very good performance on haplotype inference problems. We improve the inference accuracy by up to 2.62 percent on the APOE gene dataset while spending only about 0.001 seconds more than the original parsimonious tree-grow method. We also reach an accuracy of up to 95.2% on the β2AR gene data in comparison with previous approaches.
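    The abstract does not spell out how normalized mutual information is computed between SNPs. As a rough illustration only, the sketch below measures the association between two SNP columns as I(A;B) / min(H(A), H(B)). The normalizer, the function names, and the 0/1 allele coding are all assumptions made for this example and are not taken from the thesis.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def normalized_mutual_information(snp_a, snp_b):
    """NMI between two SNP columns, computed as I(A;B) / min(H(A), H(B)).

    The choice of normalizer is an assumption made for this sketch.
    """
    if len(snp_a) != len(snp_b):
        raise ValueError("SNP columns must have the same length")
    h_a, h_b = entropy(snp_a), entropy(snp_b)
    h_ab = entropy(list(zip(snp_a, snp_b)))  # joint entropy H(A, B)
    mutual_information = h_a + h_b - h_ab
    normalizer = min(h_a, h_b)
    # A monomorphic SNP has zero entropy and carries no information.
    return 0.0 if normalizer == 0 else mutual_information / normalizer


# Toy alleles at three SNP sites across eight haplotypes (hypothetical data).
snp1 = [0, 0, 0, 0, 1, 1, 1, 1]
snp2 = [0, 0, 0, 0, 1, 1, 1, 1]  # perfectly linked with snp1
snp3 = [0, 1, 0, 1, 0, 1, 0, 1]  # statistically independent of snp1

print(normalized_mutual_information(snp1, snp2))  # strongly associated pair
print(normalized_mutual_information(snp1, snp3))  # unassociated pair
```

    Under these assumptions, a score near 1 marks a pair of sites in strong linkage disequilibrium and a score near 0 marks an uninformative pair, which is the kind of signal a tree-growing procedure could use when deciding how to extend partial haplotypes.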

    Abstract (Chinese)
    Abstract
    Chapter 1 Introduction
    Chapter 2 Haplotype Inference Problem
    Chapter 3 Related Research
        3.1 Parsimonious tree-grow method
        3.2 Linkage Disequilibrium
    Chapter 4 Method
        4.1 Stage 1. Initialization
        4.2 Stage 2. Add linkage disequilibrium information into tree-grow procedure
        4.3 Stage 3. Collect all haplotypes inferred from algorithm
        4.4 Trying add family’s information into genotypes dataset
    Chapter 5 Experimental Data
        5.1 Datasets
        5.2 Procedure of generating genotypes
    Chapter 6 Evaluation
        6.1 Accuracy Rate
        6.2 Experiment on β2AR gene data
        6.3 Experiment on Maize data (Acetyl-CoA C-acyltransferase)
        6.4 Experiment on APOE gene data
        6.5 Experiment on CYP19 gene data
        6.6 Experiment on CYP19 gene data with family information
    Chapter 7 Conclusions and Future Work
        7.1 Conclusions
        7.2 Future Work
    References


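    Full-text availability: not authorized for public release (on-campus network)
    Full-text availability: not authorized for public release (off-campus network)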
