Graduate student: | 林威志 Wei-Chih Lin |
---|---|
Thesis title: | 以演化腳印分析及預測人類假設性基因的調控序列 Predicting Novel Gene Regulatory Motifs based on Hypothetical Genes in Human Genome Using Phylogenetic Footprinting |
Advisor: | 蘇豐文 Von-Wun Soo |
Oral defense committee: | |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications |
Year of publication: | 2006 |
Academic year of graduation: | 94 (ROC calendar) |
Language: | English |
Number of pages: | 70 |
Chinese keywords: | phylogenetic footprinting, functional region search, hypothetical gene, transcription factor, transcription factor binding site |
English keywords: | phylogenetic footprinting, motif finding, hypothetical gene, regulatory element, transcription factor binding site |
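Functional genomics focuses on annotating gene function and on building a broad understanding of gene networks, which carry out complex biological tasks. Much research is still devoted to this problem, and much to improving the accuracy of the algorithms involved. Yet even after the human genome was sequenced, nearly one quarter of its genes remain of unknown function and are annotated as hypothetical genes.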
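Previous studies have shown that genes with similar functions usually share similar transcriptional regulatory mechanisms; that is, their promoters contain similar transcription factor binding sites. Under selective pressure, these functional elements evolve more slowly than the surrounding non-functional regions. In computational studies, transcription factor binding sites found by multi-species comparison are therefore meaningful results, and this approach, known as phylogenetic footprinting, has successfully identified new transcription factor binding sites in many genes. This thesis applies the method through promoter retrieval, promoter prediction, and the search for transcription factor binding sites. Hypothetical genes are then used as the material and searched together with cancer-related genes, in order to find and compare regulatory elements that the two groups may share.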
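I analyze the promoter regions of human hypothetical genes that have orthologs in other species, and provide a web service that lets biologists easily analyze gene relationships across species. Finally, the study found highly conserved sites shared by these hypothetical genes and cancer-related genes, which is a meaningful result, and it distinguishes these highly conserved regulatory sites by taxonomic classification.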
Functional genomics focuses on assigning genes to functional categories and on providing a comprehensive understanding of genetic networks, which are intricate because they carry out complex biological tasks. Much work is still devoted to deciphering these networks and to improving the accuracy of the algorithms used. Yet about one fourth of the genes in the human genome remain functionally uncharacterized and are annotated as hypothetical genes. Genes involved in the same biological process are often regulated by similar transcriptional mechanisms and are likely to contain similar transcription factor binding sites (TFBS) in their proximal promoters. Functional elements tend to evolve much more slowly than non-functional regions because they are subject to selective pressure. A multi-species approach therefore makes sense for in silico TFBS prediction, and it has been used successfully to identify regulatory elements in various genes. This method is known as "phylogenetic footprinting". The workflow proceeds through promoter extraction, promoter prediction, and regulatory element detection, with both hypothetical genes and cancer-related genes as test inputs.
In this thesis, I analyze the promoter regions of Homo sapiens hypothetical genes that have homologs in other organisms, and provide a web service that lets biologists easily analyze genetic networks across organisms. The results are of interest: several conserved elements were discovered within some hypothetical genes and cancer-related genes, and highly conserved regulatory elements are reported for different taxonomic nodes.
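The idea behind phylogenetic footprinting, that functional sites stay conserved across species while the surrounding sequence drifts, can be illustrated with a minimal Python sketch. This is not the pipeline used in the thesis, nor the parsimony-based algorithm of tools such as FootPrinter. It only scans a reference promoter for k-mers that also occur, within a mismatch budget, in every orthologous promoter. The function names (`hamming`, `occurs_within`, `conserved_kmers`) and the three toy sequences are hypothetical and chosen for illustration only.

```python
from __future__ import annotations


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(kmer: str, seq: str, max_mismatches: int) -> bool:
    """True if some window of seq matches kmer with at most max_mismatches."""
    k = len(kmer)
    return any(
        hamming(kmer, seq[i:i + k]) <= max_mismatches
        for i in range(len(seq) - k + 1)
    )


def conserved_kmers(
    promoters: dict[str, str],
    reference: str,
    k: int,
    max_mismatches: int = 0,
) -> list[tuple[int, str]]:
    """Return (position, k-mer) pairs from the reference promoter that
    are found in every other promoter within the mismatch budget."""
    ref_seq = promoters[reference]
    others = [seq for name, seq in promoters.items() if name != reference]
    hits = []
    for i in range(len(ref_seq) - k + 1):
        kmer = ref_seq[i:i + k]
        if all(occurs_within(kmer, seq, max_mismatches) for seq in others):
            hits.append((i, kmer))
    return hits


# Toy orthologous "promoters" (invented sequences, not real data).
# Each one embeds the same 8-mer in a different sequence context.
toy_promoters = {
    "human": "ACGTTGACGTCATTAGC",
    "mouse": "GGCTGACGTCAAGT",
    "chicken": "TTTGACGTCACCCA",
}

print(conserved_kmers(toy_promoters, reference="human", k=8))
# prints [(4, 'TGACGTCA')]
```

Raising `max_mismatches` admits degenerate sites at the cost of more spurious hits. A real analysis would also weight species by their evolutionary distance instead of treating every ortholog equally.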