Graduate student: | 張君天 Chang, Chun-Tien |
---|---|
Thesis title: | 透過分析雜合去氧核醣核酸圖譜偵測基因體變異序列的混合序列閱讀程式 Mixed Sequence Reader (MSR) program for analyzing DNA sequences with heterozygous base calling chromatography to detect genomic variations |
Advisor: | 唐傳義 Tang, Chuan Yi |
Oral defense committee: | 唐傳義 Tang, Chuan Yi; 謝文萍 Hsieh, Wen-Ping; 韓永楷 Hon, Wing-Kai; 李御賢 Lee, Yun-Shien; 蔡七女 Tsai, Chi-Neu; 林俊淵 Lin, Chun-Yuan; 劉明麗 Liou, Ming-Li |
Degree: | Doctor |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Year of publication: | 2012 |
Academic year of graduation: | 100 (ROC calendar) |
Language: | English |
Number of pages: | 46 |
Chinese keywords: | 去氧核醣核酸圖譜 (DNA chromatogram), 混合序列 (mixed sequence), 基因體變異 (genomic variation) |
English keywords: | DNA chromatography, heterozygous base calling, genomic variation |
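When products of the reverse transcription polymerase chain reaction that contain single nucleotide polymorphisms, insertion-deletion sequences, short tandem repeats, or paralogous genes are sequenced directly, heterozygous fluorescence chromatograms are obtained. Insertion-deletion sequences and short tandem repeats can be detected easily, without a reference sequence database, using existing software such as Indelligent or ShiftDetector. However, detecting genomic variation remains a challenge because suitable tools for analyzing heterozygous fluorescence chromatogram data are lacking. In this study, we developed a free web tool, the "Mixed Sequence Reader", that directly analyzes heterozygous fluorescence chromatogram data in the ABI file format. The two heterozygous sequences can be identified and separated by alignment against a reference sequence database. Our results show that the Mixed Sequence Reader can be used to: (1) determine the actual physical position of insertion-deletion sequences and short tandem repeats in the reference sequence and calculate the repeat count of short tandem repeats; (2) predict microsatellite combination patterns using the US Federal Bureau of Investigation Combined DNA Index System; (3) identify the viral types in multiple infections using the currently known human papillomavirus databases; and (4) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralogs.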
When PCR products containing single nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), or paralogous genes are sequenced directly, the resulting fluorescence chromatograms contain heterozygous base calls. Indels and STRs can be detected easily with the currently available Indelligent or ShiftDetector programs, without searching reference sequences. However, detecting other genomic variants remains a challenge because appropriate tools for analyzing heterozygous base-calling fluorescence chromatogram data are lacking. In this study, we developed the free, web-based “Mixed Sequence Reader (MSR)”, which directly analyzes heterozygous base-calling fluorescence chromatogram data in .abi file format against reference sequences. The heterozygous sequence is resolved into two distinct sequences, which are aligned with the reference sequences. Our results show that MSR may be used for: (i) physically locating indel and STR sequences by searching the NCBI reference sequences and determining the copy number of STRs; (ii) predicting combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determining human papillomavirus (HPV) genotypes by searching current viral databases in cases of multiple infections; and (iv) estimating the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3.
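The idea of resolving a mixed read into two sequences by comparison with references can be illustrated with a toy sketch. The Python below is not the MSR implementation. It assumes the mixed base calls are already encoded as IUPAC ambiguity codes and that all sequences are pre-aligned and of equal length. The function names, reference names, and sequences are hypothetical and chosen only for illustration.

```python
from itertools import combinations_with_replacement

# IUPAC codes covering one base or a mixture of two bases.
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "S": {"C", "G"},
    "W": {"A", "T"}, "K": {"G", "T"}, "M": {"A", "C"},
}


def explains(mixed, seq_a, seq_b):
    """Return True if overlaying seq_a and seq_b reproduces every mixed call."""
    if not (len(mixed) == len(seq_a) == len(seq_b)):
        return False
    return all(IUPAC[m] == {a, b} for m, a, b in zip(mixed, seq_a, seq_b))


def resolve(mixed, references):
    """Return all pairs of reference names whose overlay matches the mixed read."""
    names = sorted(references)
    return [
        (x, y)
        for x, y in combinations_with_replacement(names, 2)
        if explains(mixed, references[x], references[y])
    ]


# Hypothetical pre-aligned reference sequences (illustration only).
references = {
    "type_1": "ACGTACGT",
    "type_2": "ACGAACGC",
    "type_3": "TCGTACGT",
}

# W = A/T and Y = C/T mark positions with two superimposed peaks.
mixed_read = "ACGWACGY"
print(resolve(mixed_read, references))
```

With these made-up inputs, only the overlay of `type_1` and `type_2` reproduces the mixed read. Real chromatogram data would also need peak calling, handling of shifted traces caused by indels, and tolerance for sequencing errors, none of which this sketch attempts.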