Graduate student: | 李季青 Chi-Ching Lee |
---|---|
Thesis title: | 利用DNA與蛋白質探針來建構全基因體及蛋白質體樹 Construction of whole genomic and proteomic trees based on DNA and Protein probes |
Advisor: | 呂平江 Ping-Chiang Lyu |
Oral defense committee: | |
Degree: | Master |
Department: | College of Life Sciences and Medicine - Institute of Bioinformatics and Structural Biology |
Year of publication: | 2007 |
Academic year of graduation: | 95 (ROC calendar) |
Language: | English |
Number of pages: | 66 |
Keywords: | Comparative genome, Bioinformatics, Microbial genome, Whole genome comparison, Genomic tree, Proteomic tree |
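Microorganisms differ enormously in morphology and habitat, which makes it difficult for studies of their systematics and evolutionary relationships to reach consistent conclusions. Since the 1970s, researchers have worked to establish microbial classification systems grounded in molecular evolution, using stable biomarkers with shared evolutionary characteristics to determine the evolutionary relationships among microorganisms. For example, sequence similarity analysis of small subunit ribosomal RNA (SSU RNA) was the earliest biomarker applied to the study of prokaryotic evolutionary relationships and is still widely used today. However, relying on only a small number of biomarkers to infer the evolutionary relationships of all species is now considered insufficient. After 2000, as genome sequencing technology matured and more and more microbial genomes were completed, scientists began to explore the relationships among species from a whole-genome perspective.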
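We established a clustering method based on whole genomes and proteomes to classify microorganisms and, on that basis, to analyze their position and significance in evolution. We parse genomes and proteomes using biologically meaningful amino acid and nucleotide sequence patterns: the amino acid patterns are taken from the Prosite database, and the nucleotide patterns are the recognition sequences of restriction enzymes, taken from REBASE (the Restriction Enzyme dataBASE). The frequencies with which these patterns occur in whole genomes and proteomes are tallied and then analyzed with an unsupervised clustering method.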
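The protein-side probe counting can be illustrated with a minimal sketch. It assumes the standard PROSITE pattern syntax (`x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` for repeats, `<` and `>` as terminal anchors) and converts a pattern into a Python regular expression. The helper names `prosite_to_regex` and `motif_frequency`, the toy protein strings, and the per-1000-residue normalization are illustrative assumptions, not the procedure used in the thesis.

```python
import re

# One PROSITE pattern element: optional '<', a residue / [set] / {excluded set} / x,
# an optional repeat count "(n)" or "(n,m)", and an optional '>'.
_ELEMENT = re.compile(
    r"(<?)(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?(>?)"
)


def prosite_to_regex(pattern):
    """Translate a PROSITE pattern string into a Python regex string.

    Rare syntax such as '>' inside square brackets is not handled here.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        start, core, low, high, end = m.groups()
        if core == "x":                      # any amino acid
            core = "."
        elif core.startswith("{"):           # any amino acid except these
            core = "[^" + core[1:-1] + "]"
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(("^" if start else "") + core + repeat + ("$" if end else ""))
    return "".join(parts)


def motif_frequency(pattern, proteins):
    """Occurrences of a pattern per 1000 residues (assumed normalization)."""
    # The lookahead lets overlapping occurrences be counted separately.
    regex = re.compile("(?=" + prosite_to_regex(pattern) + ")")
    hits = sum(len(regex.findall(p)) for p in proteins)
    residues = sum(len(p) for p in proteins)
    return 1000.0 * hits / residues if residues else 0.0


# Toy "proteome" (made-up sequences) scanned with the N-glycosylation
# site pattern PS00001.
toy_proteome = ["MNGSA", "NPSA"]
print(prosite_to_regex("N-{P}-[ST]-{P}."))
print(motif_frequency("N-{P}-[ST]-{P}.", toy_proteome))
```

Computing one such frequency per pattern gives each organism a numeric profile that a clustering method can then compare.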
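The results show that our genomic tree groups together microorganisms with similar GC content. In addition, clustering with Prosite patterns separates archaea, bacteria, and fungi into two groups, with the latter two falling in the same group. The basal levels of this proteomic tree resemble the traditional classification, while the more distal branches more aptly group together microorganisms with similar biochemical and metabolic phenotypes, such as parasitic bacteria, thermophilic bacteria, methanogens, and photosynthetic bacteria. We have implemented this clustering and comparison method as an online service built with PHP, a MySQL database, and graphical data presentation techniques, available at: http://probac.life.nthu.edu.tw/.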
The classification of microorganisms is difficult because they vary widely in morphology and in the environments they inhabit. Since the 1970s, taxonomic systems have been developed based on stable, standard molecular biomarkers; for instance, sequence similarity of SSU RNA (small subunit ribosomal RNA) was the first biomarker used for prokaryotes and is still widely used today. However, classifying all kinds of organisms with one or only a few biomarkers has been reported to be insufficient. Since 2000, genome sequencing techniques have developed so rapidly that it is now possible to analyze the evolutionary relationships of organisms on the scale of whole genomes.
We have developed a probe-based genome/proteome clustering approach based on the frequency of biologically meaningful restriction enzyme recognition elements and protein signatures. These elements and signatures are provided by REBASE (the Restriction Enzyme dataBASE) and the Prosite database, a collection of annotated motif descriptors for protein families and domains. We compared bacteria, archaea, and fungi to build genomic and proteomic trees with an unsupervised clustering method.
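As a rough illustration of the DNA-probe side, the sketch below counts a few restriction enzyme recognition sites in each sequence, turns the counts into a per-kilobase frequency profile, and joins the profiles with a simple average-linkage agglomerative clustering. The four-enzyme probe set, the toy sequences, the Euclidean distance, and the choice of average linkage are all assumptions made for this example. The abstract states only that an unsupervised clustering method was applied to probe frequencies. Degenerate recognition sites and reverse-strand matches are not handled.

```python
import re
from itertools import combinations
from math import sqrt

# Hypothetical probe subset: recognition sites of four well-known enzymes.
PROBES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT", "NotI": "GCGGCCGC"}


def count_overlapping(site, seq):
    """Count occurrences of site in seq, including overlapping ones."""
    return len(re.findall("(?=" + site + ")", seq))


def probe_profile(seq, probes=PROBES):
    """Occurrences of each probe per 1000 bases of the sequence."""
    seq = seq.upper()
    length = max(len(seq), 1)
    return [1000.0 * count_overlapping(site, seq) / length for site in probes.values()]


def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles):
    """Agglomerative clustering; returns the tree as nested tuples of names."""
    # Each cluster is (subtree, list of member names).
    clusters = [(name, [name]) for name in profiles]

    def distance(pair):
        (_, a), (_, b) = pair
        total = sum(euclidean(profiles[x], profiles[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        left, right = min(combinations(clusters, 2), key=distance)
        clusters = [c for c in clusters if c is not left and c is not right]
        clusters.append(((left[0], right[0]), left[1] + right[1]))
    return clusters[0][0]


# Made-up toy "genomes": A and B are rich in EcoRI sites, C in BamHI sites.
genomes = {
    "A": "GAATTC" * 5 + "ACGT" * 10,
    "B": "GAATTC" * 4 + "ACGT" * 12,
    "C": "GGATCC" * 5 + "T" * 40,
}
profiles = {name: probe_profile(seq) for name, seq in genomes.items()}
tree = average_linkage(profiles)
print(tree)
```

On real data the profile would span the full probe set and the sequences would be complete genomes, so a dedicated clustering library would be a more practical choice than this quadratic loop.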
Our results showed that the genomic tree grouped together microorganisms with similar GC content, and the proteomic tree clustered bacteria, archaea, and fungi into two branches, where the latter two share the same node. Furthermore, the tree built from Prosite signatures agreed well with the traditional phylogeny at the basal branches, while the distal classifications seemed to reflect phenotypic features, such as parasitism, thermophilicity, and the capability for methanogenesis or photosynthesis, better than traditional SSU RNA-based classifications. A web service has been set up and is available at: http://probac.life.nthu.edu.tw/.