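Graduate student: | 吳俊穎 Wu, Chun Ying |
---|---|
Thesis title: | 利用標準化互信息與樹成長方法預測單倍體 Normalized Mutual Information and Tree-growing Methods for Haplotype Inference |
Advisor: | 蘇豐文 Soo, Von Wun |
Oral defense committee: | 陳宜欣 Chen, Yi Shin; 陳煥宗 Chen, Hwann Tzong |
Degree: | 碩士 Master |
Department: | 電機資訊學院 - 資訊系統與應用研究所 College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications |
Year of publication: | 2015 |
Academic year of graduation: | 103 (ROC calendar) |
Language: | Chinese |
Number of pages: | 32 |
Chinese keywords: | 單倍體預測 (haplotype inference) |
Foreign-language keywords: | haplotype inference |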
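Since the Human Genome Project revealed the genome sequence, haplotypes have become highly valuable information for research on hereditary diseases and on drug target prediction. Obtaining haplotype information through clinical experiments, however, is very time-consuming and economically demanding, so in recent years a growing number of studies have sought to infer haplotype information computationally in order to obtain it faster. In this thesis, for the problem of inferring correct haplotypes, we propose incorporating information about the mutual correlation among multiple single-nucleotide polymorphisms (SNPs) into the tree-growing approach. With this method, we obtain an improvement in inference accuracy of up to 2.62% over the parsimonious tree-grow method, and the highest inference accuracy reaches 95.2%. With highly accurate haplotype inference, we hope that the step of obtaining haplotype information through clinical experiments can be skipped and the inferred haplotypes used directly to support other related research.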
Haplotypes are a powerful kind of information for candidate-gene studies because they reflect inheritance characteristics. However, obtaining haplotype information with in vitro methods costs a great deal of time and money, so it is helpful to infer haplotypes using in silico methods. Because haplotype inference is an NP-hard problem, both accuracy and computation time are important issues. In this thesis, we incorporate normalized mutual information into the parsimonious tree-grow method, which already shows very good performance on haplotype inference problems. We improve the inference accuracy by up to 2.62 percent on the APOE gene dataset while spending only about 0.001 seconds more than the original parsimonious tree-grow method. We also achieve the highest accuracy, 95.2%, on the β2AR gene data in comparison with previous approaches.
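To make the notion of normalized mutual information between SNP sites concrete, the following is a minimal sketch rather than the thesis's implementation. It assumes biallelic sites coded as 0/1 and the normalization I(X;Y) / sqrt(H(X) H(Y)); the abstract does not state which normalization the thesis uses, and the function names and toy data are hypothetical.

```python
from collections import Counter
from math import log2, sqrt


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete symbols."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def normalized_mutual_information(x, y):
    """NMI of two equally long allele columns, scaled to [0, 1].

    Assumption: normalization by sqrt(H(X) * H(Y)); other variants exist.
    """
    if len(x) != len(y):
        raise ValueError("columns must have the same length")
    h_x, h_y = entropy(x), entropy(y)
    if h_x == 0 or h_y == 0:
        return 0.0  # a monomorphic site shares no information with any other site
    h_xy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    mutual_information = h_x + h_y - h_xy
    return mutual_information / sqrt(h_x * h_y)


# Hypothetical toy data: alleles (0/1) at three SNP sites across eight haplotypes.
snp_a = [0, 0, 0, 0, 1, 1, 1, 1]
snp_b = [0, 0, 0, 0, 1, 1, 1, 1]  # identical to snp_a
snp_c = [0, 1, 0, 1, 0, 1, 0, 1]  # statistically independent of snp_a

print(normalized_mutual_information(snp_a, snp_b))
print(normalized_mutual_information(snp_a, snp_c))
```

A score near 1 indicates that two sites are strongly associated, and a score near 0 indicates that they are close to independent. How such scores are used inside the tree-growing procedure is not described in the abstract and is left out of this sketch.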