
Author: 賴思明
Szu-Ming Lai
Thesis title: 微生物全基因體特性資料庫之建立與應用
GPDB (Genome Profile DataBase): Construction and Application of Complete Microbial Genome Analysis Database
Advisor: 呂平江
Ping-Chiang Lyu
Oral defense committee:
Degree: Master
Department: College of Life Sciences and Medicine - Institute of Bioinformatics and Structural Biology
Year of publication: 2004
Academic year of graduation: 92 (ROC calendar)
Language: Chinese
Number of pages: 120
Keywords (Chinese): 比較基因體, 基因體特性, 生物資訊, 微生物基因體, 全基因體比較, 虛擬二維電泳
Keywords (English): Comparative genome, Genome profile, Bioinformatics, Microbial genome, Whole genome comparison, Virtual 2D gel
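  • Chinese abstract

    As more and more prokaryotic genomes are completely sequenced, it has become feasible to use complete sequences to explore the diversity among prokaryotes, and many comparative genomics studies aim to identify the similarities and differences between species. Because prokaryotes are difficult to distinguish morphologically and live in highly diverse environments, it has been hard to reach consistent conclusions about their classification and evolutionary position. Only with molecular evolution methods did it become possible to analyze the phylogenetic relationships among prokaryotes; in particular, 16S rRNA analysis divided extant organisms into three major forms of life. For microorganisms living in extreme environments, such as thermophiles, acidophiles, and halophiles, studies have also reported biases in nucleotide and amino acid composition, and many studies have used whole-genome features to reconstruct the evolutionary position of prokaryotes. We therefore built a database, GPDB (Genome Profile DataBase), so that biologists can use whole-genome information to study the evolution and diversity of prokaryotes. The database currently contains 145 completely sequenced prokaryotic strains, comprising 223 chromosomes and plasmids and 429177 ORFs. The original sequences and taxonomic data come from the NCBI GenBank and Taxonomy databases. To analyze this information automatically, we wrote the "Genome Profile Pipeline" in Perl to compute the different genome profiles, including nucleotide composition (GC & AT content, total GC & AT skew, N-nucleotide frequency, codon usage…), amino acid composition (N-peptide frequency distribution), and proteome composition (length & Mw & pI & transmembrane helix protein & fold…). With a MySQL database as the back end, the genome profiles of different organisms are presented and compared through graphical web pages, and hierarchical clustering is applied to help group organisms with similar features. In addition, we provide a virtual two-dimensional gel electrophoresis tool to support proteomics research. The whole website is designed in a modular way so that new genome profiles can easily be added later, providing more whole-genome information for exploring prokaryotic diversity.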
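
    The nucleotide-composition profiles named in the abstract (GC content, total GC and AT skew, N-nucleotide frequency) can be illustrated with a minimal sketch. This is not the thesis's Perl pipeline: the function names are hypothetical, and the skew formula (G - C)/(G + C) is the commonly used definition, assumed here because the abstract does not state which one GPDB uses.

```python
from collections import Counter

BASES = "ACGT"


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    total = sum(seq.count(b) for b in BASES)
    if total == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / total


def total_skew(seq: str, first: str, second: str) -> float:
    """Whole-sequence skew, (first - second) / (first + second).

    total_skew(seq, "G", "C") gives the GC skew and
    total_skew(seq, "A", "T") the AT skew (assumed definition).
    """
    seq = seq.upper()
    n1, n2 = seq.count(first), seq.count(second)
    if n1 + n2 == 0:
        return 0.0
    return (n1 - n2) / (n1 + n2)


def kmer_frequencies(seq: str, k: int) -> dict:
    """Relative frequency of every overlapping k-mer made only of A, C, G, T."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set(BASES)
    )
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


if __name__ == "__main__":
    toy = "GGGCAT"  # toy sequence, not real genome data
    print(gc_content(toy))
    print(total_skew(toy, "G", "C"), total_skew(toy, "A", "T"))
    print(kmer_frequencies(toy, 2))
```

    With k = 2 and k = 3, `kmer_frequencies` yields di- and tri-nucleotide profiles; ambiguous bases such as N are skipped rather than counted.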


    Abstract

    With the rapid accumulation of whole-genome sequence data, especially for microbial organisms, it has become possible to explore the diversity of ancient life. A growing number of comparative genomic methods are used to investigate the similarities and dissimilarities between organisms. Phylogenetic trees based on 16S rRNA reveal prokaryotic evolutionary relationships that cannot be resolved from morphological characteristics. Other features, such as GC content and amino acid composition, are widely used to characterize organisms from extreme environments, such as thermophiles, acidophiles, and halophiles. Mining genome-wide information may suggest why microbes are so diverse. Here we constructed a database, GPDB (Genome Profile DataBase), with 145 microbial genomes, including bacteria and archaea. The original sequence data and annotations are based on the NCBI GenBank and RefSeq databases. Uniform nomenclature and classification follow the taxonomy database at NCBI. To process so many features automatically, a program called "Genome Profile Pipeline" was developed in Perl. We present a variety of genome profiles in graphical form, such as basic information (taxonomy, genome size, ORF number…), nucleotide composition (GC & AT content, total GC & AT skew, N-nucleotide frequency, codon usage…), and amino acid composition (N-peptide frequency distribution, proteome distribution like length & Mw & pI & transmembrane helix protein & fold…). To evaluate different combinations interactively, an online graphical browsing interface that uses Euclidean distance for hierarchical clustering was built to compare and view the differences between these organisms. Furthermore, the website is modular, so that more genome profiles can be included and compared in the future.
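
    The comparison step described above, hierarchical clustering of genome profiles by Euclidean distance, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name a linkage rule, so average linkage is assumed, and the function names and profile vectors are hypothetical rather than taken from GPDB.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length profile vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_linkage(profiles):
    """Agglomerative clustering with average linkage (assumed linkage rule).

    profiles maps an organism name to its profile vector.
    Returns the merges in order as (cluster_a, cluster_b, distance),
    where each cluster is a tuple of organism names.
    """
    def cluster_distance(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        pairs = [(x, y) for x in c1 for y in c2]
        return sum(euclidean(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)

    clusters = [(name,) for name in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        # Merge the closest pair of clusters at each step.
        a, b = min(combinations(clusters, 2), key=lambda p: cluster_distance(*p))
        merges.append((a, b, cluster_distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [tuple(sorted(a + b))]
    return merges


if __name__ == "__main__":
    # Hypothetical two-feature profiles, not values from the database.
    toy_profiles = {"A": [0.0, 0.0], "B": [0.0, 1.0], "C": [5.0, 0.0]}
    for left, right, dist in average_linkage(toy_profiles):
        print(left, right, round(dist, 3))
```

    Because Euclidean distance is sensitive to scale, profiles measured in different units (for example GC content and protein length) would normally be normalized before clustering; that step is omitted here.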

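    Table of Contents

    Table of Contents
    List of Tables and Figures
    Chinese Abstract
    Abstract
    Chapter 1. Introduction
      1.1 Preface
      1.2 Sequencing and Genome Annotation
        1.2.1 Clone-by-clone Shotgun Sequencing
        1.2.2 Whole Genome Shotgun Sequencing
        1.2.3 Genome Annotation
      1.3 Prokaryotes
        1.3.1 Nomenclature, Classification, and Morphological Characteristics
        1.3.2 Molecular Evolution of Prokaryotes
        1.3.3 Lateral/Horizontal Gene Transfer (LGT/HGT)
      1.4 Genome Evolution
        1.4.1 Comparative Genomics
        1.4.2 The Minimal Genome Concept
        1.4.3 Using Whole-Genome Information to Study the Evolution and Diversity of Prokaryotes
      1.5 Research Motivation
    Chapter 2. Materials and Methods
      2.1 Materials
        2.1.1 Hardware
        2.1.2 Original Sequences and Annotation Data
        2.1.3 Software for Building the Website
      2.2 Methods
        2.2.1 Obtaining the Original Data
        2.2.2 Defining the Genome Profile
        2.2.3 Related Analysis Software
        2.2.4 Introduction to the Genome Profile Pipeline
        2.2.5 Database Design
        2.2.6 Online Hierarchical Clustering
        2.2.7 Virtual 2D Gel
        2.2.8 GPDB Construction Flowchart
    Chapter 3. Results and Discussion
      3.1 Browsing Genome Profiles
        3.1.1 Basic Information
        3.1.2 Nucleotide Composition
        3.1.3 Amino Acid Composition
        3.1.4 Proteome Distribution
      3.2 Virtual 2D Gel
      3.3 Comparing Genome Profiles
      3.4 Discussion of Unusual Genome Profiles
        3.4.1 Extreme Isoelectric Point Distributions
        3.4.2 Homologous Proteins with Very Different Isoelectric Points
        3.4.3 Unusual AT and GC Skew
    Chapter 4. Conclusion
    Chapter 5. Future Prospects
    Chapter 6. References
    Appendix 1. GPDB Species Name Abbreviations
    Appendix 2. GPDB Sequences - NCBI Accession Numbers
    Appendix 3. Biological Categories of Organisms

    List of Tables and Figures

    [Table 1] 14 organisms with extreme isoelectric point distributions and the characteristics of their environments
    [Table 2] Species in which proteins with an unusual isoelectric point distribution account for more than 5%
    [Table 3] H. pylori 26695 proteins whose pI differs from that of their homologs by more than 3
    [Figure 1] Schematic of the two main sequencing strategies
    [Figure 2] Steps for building BACs and assembling sequence fragments
    [Figure 3] Main workflow of clone-by-clone shotgun sequencing
    [Figure 4] Main workflow of whole genome shotgun sequencing
    [Figure 5] Division of life into three domains based on SSU rRNA (small ribosomal subunit RNA)
    [Figure 6] Three modes of horizontal gene transfer in bacteria
    [Figure 7] Different genes have different evolutionary histories
    [Figure 8] Finding the minimal gene-set by whole-genome comparison
    [Figure 9] GPDB database schema
    [Figure 10] Theory of two-dimensional electrophoresis
    [Figure 11] Virtual 2D Gel flowchart
    [Figure 12] GPDB construction flowchart
    [Figure 13] GPDB entry, login, and registration pages
    [Figure 14] GPDB Browse page
    [Figure 15] Basic information on the GPDB web page
    [Figure 16] H. pylori 26695 nucleotide composition
    [Figure 17] H. pylori 26695 total AT and GC skew
    [Figure 18] H. pylori 26695 di-nucleotide frequency distribution
    [Figure 19] H. pylori 26695 tri-nucleotide frequency distribution
    [Figure 20] H. pylori 26695 codon usage
    [Figure 21] H. pylori 26695 codon usage
    [Figure 22] H. pylori 26695 grouped amino acid composition distribution
    [Figure 23] H. pylori 26695 composition distribution of the twenty amino acids
    [Figure 24] H. pylori 26695 transmembrane protein distribution (pie chart)
    [Figure 25] H. pylori 26695 protein charge distribution (pie chart)
    [Figure 26] H. pylori 26695 protein isoelectric point distribution
    [Figure 27] H. pylori 26695 protein molecular weight distribution
    [Figure 28] H. pylori 26695 protein length distribution
    [Figure 29] Simulated two-dimensional electrophoresis of H. pylori 26695
    [Figure 30] Simulated two-dimensional electrophoresis of H. pylori 26695
    [Figure 31] GPDB Compare page
    [Figure 32] Comparison of AT and GC content
    [Figure 33] Comparison of ATGC composition
    [Figure 34] Comparison of di-nucleotide composition
    [Figure 35] Comparison of tri-nucleotide composition
    [Figure 37] Comparison of di-peptide composition
    [Figure 38] Comparison of protein length distributions
    [Figure 39] Comparison of protein molecular weight distributions
    [Figure 40] Comparison of protein isoelectric point distributions
    [Figure 41] Comparison of total codon usage
    [Figure 42] Comparison of total codon usage
    [Figure 43] 14 genomes whose protein isoelectric point distribution is one-tailed
    [Figure 44] Isoelectric point distribution of the 429117 proteins in GPDB
    [Figure 45] Isoelectric point distribution
    [Figure 46] H. pylori 26695 proteins whose isoelectric point differs from that of their homologs by more than 3
    [Figure 47] Total AT skew statistics
    [Figure 48] Total GC skew statistics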


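    Full-text release date: this full text is not authorized for public release (campus network)
    Full-text release date: this full text is not authorized for public release (off-campus network)
    Full-text release date: this full text is not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)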