
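Graduate student: 蔡金良
Tsai, Chin-Liang
Thesis title: 建立醣苷水解酵素家族序列上的共同特徵
Finding Consistent Sequence Patterns in Glycoside Hydrolase (GH) Protein Families
Advisor: 唐傳義
Tang, Chuan Yi
Oral defense committee: 廖崇碩
林俊淵
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2011
Academic year of graduation: 99 (ROC calendar)
Language: Chinese
Number of pages: 24
Keywords (Chinese, translated): glycoside hydrolases, sequence alignment
Keywords (English): Glycoside hydrolases, Multiple sequence alignment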
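  • The main goal of this thesis is to establish common sequence patterns in the catalytic domains of glycoside hydrolase families. The glycoside hydrolase database (CAZy) classifies these enzymes into 125 families by sequence similarity, and each family can catalyze one or more hydrolysis reactions. However, PROSITE, the database that describes sequence patterns, covers only 18 glycoside hydrolase families. In addition, some multiple sequence alignment tools cannot process very many sequences at once, so instead of aligning all sequences we cluster them by sequence identity and randomly select one representative sequence from each cluster. In some families, the patterns produced this way can replace those built from all sequences and describe the family's own sequences better, and they do not change much from run to run when different sequences are selected. We therefore use this method to build patterns for the glycoside hydrolase families.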
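
The cluster-then-sample step described in the abstract can be illustrated with a minimal Python sketch. This record does not state which clustering algorithm, identity definition, or threshold the thesis used, so everything here is an assumption made for illustration: identity is computed from a simple Needleman-Wunsch global alignment with made-up scores, clustering is greedy against each cluster's first member, and the sequences, the 0.8 threshold, and the function names are hypothetical.

```python
import random


def global_identity(a: str, b: str) -> float:
    """Fraction of identical columns in a simple global alignment of a and b.

    Uses Needleman-Wunsch with illustrative scores (match +1, mismatch -1,
    gap -1); these scores are an assumption, not taken from the thesis.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = -i
    for j in range(1, m + 1):
        score[0][j] = -j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (1 if a[i - 1] == b[j - 1] else -1)
            score[i][j] = max(diag, score[i - 1][j] - 1, score[i][j - 1] - 1)
    # Trace back one optimal path, counting identical columns and total columns.
    i, j, same, length = n, m, 0, 0
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            1 if a[i - 1] == b[j - 1] else -1
        ):
            same += int(a[i - 1] == b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] - 1:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        length += 1
    return same / length if length else 1.0


def cluster_by_identity(seqs: dict, threshold: float) -> list:
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters = []
    for name, seq in seqs.items():
        for cluster in clusters:
            # Compare only against the cluster's first member (its seed).
            if global_identity(seq, seqs[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


def pick_representatives(clusters: list, rng: random.Random) -> list:
    """Randomly choose one sequence name from every cluster."""
    return [rng.choice(cluster) for cluster in clusters]


# Hypothetical toy sequences, not real GH family members.
toy_seqs = {
    "s1": "MKTAYIAKQR",
    "s2": "MKTAYIAKQK",
    "s3": "GGSWLPVNDE",
    "s4": "GGSWLPVNDQ",
}
toy_clusters = cluster_by_identity(toy_seqs, threshold=0.8)
toy_reps = pick_representatives(toy_clusters, random.Random(0))
print(toy_clusters, toy_reps)
```

Only the representatives would then be passed to a multiple sequence alignment tool, which keeps the input small. Rerunning the selection with different seeds is one way to probe whether the resulting patterns stay stable, which is the consistency property the abstract reports.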


    Finding Consistent Sequence Patterns in Glycoside Hydrolase (GH) Protein Families
    Chin-Liang Tsai, Advisor: Professor Chuan Yi Tang
    Master of Science in Computer Science, National Tsing Hua University, Hsinchu City, Taiwan

    In this research, our major objective is to find consistent sequence patterns in GH protein families. GH sequences are classified into 125 families by sequence similarity, but the PROSITE database provides motif descriptions or patterns for only 18 GH families. In addition, some multiple sequence alignment (MSA) tools cannot handle a large number of sequences. We clustered GH proteins by sequence identity and randomly selected one sequence from every cluster to build the patterns. We compared patterns built from all sequences with patterns built by our method. The sensitivity of our patterns is greater than that of patterns built from all sequences, and the patterns remained consistent across different random selections of sequences. We used this method to construct patterns for the GH protein families.

    Key words: Glycoside hydrolases, Multiple sequence alignment, Pattern
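
    The sensitivity comparison mentioned in the abstract can be made concrete with a short Python sketch. It assumes patterns are written in PROSITE pattern syntax (elements joined by `-`, `x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` for repetition) and that sensitivity means the fraction of a family's sequences that the pattern matches. The record does not give the thesis's exact definition, and the pattern, sequences, and function names below are hypothetical.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Simplified converter: it assumes '<' and '>' anchors appear only at the
    start or end of the pattern, not inside square brackets.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # {P} means "any residue except P"  ->  [^P]
        element = element.replace("{", "[^").replace("}", "]")
        # x(2) or x(2,4) repetition  ->  .{2} or .{2,4}
        element = element.replace("(", "{").replace(")", "}")
        # x matches any residue
        element = element.replace("x", ".")
        # N-terminal and C-terminal anchors
        element = element.replace("<", "^").replace(">", "$")
        parts.append(element)
    return "".join(parts)


def sensitivity(pattern: str, family_seqs: list) -> float:
    """Fraction of family sequences containing at least one pattern match."""
    if not family_seqs:
        return 0.0
    regex = re.compile(prosite_to_regex(pattern))
    hits = sum(1 for seq in family_seqs if regex.search(seq))
    return hits / len(family_seqs)


# Hypothetical pattern and toy sequences, not taken from PROSITE or CAZy.
toy_pattern = "[LIVM]-x(2)-E-{P}-G"
toy_family = ["AALKDEAGT", "MIPQEWGKK", "TTVAAEPGS", "GGGGGGG"]
print(prosite_to_regex(toy_pattern), sensitivity(toy_pattern, toy_family))
```

    Computing this value once for a pattern built from all sequences and once for a pattern built from the sampled representatives gives the kind of side-by-side comparison the abstract summarizes.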

    Chinese Abstract iii
    ABSTRACT iv
    Acknowledgements v
    TABLE OF CONTENTS vi
    Chapter 1 - Introduction 1
    Chapter 2 - Material and Methods 4
      2.1 Material 4
      2.2 Workflow 4
      2.3 Data processing 6
      2.4 Sequence alignment 7
      2.5 Pattern construction 8
    Chapter 3 - Results and discussion 11
      3.1 Results 11
      3.2 Comparison of patterns with using all and selected sequences 11
      3.3 Comparison of patterns with PROSITE and our 12
      3.4 Comparison of patterns for different MSA tools 13
      3.5 Patterns and active site 14
      3.6 Random and star sequences selection comparison 15
    Chapter 4 - Conclusions and Future work 16
      4.1 Conclusions 16
      4.2 Future work 16
    REFERENCES 17
    APPENDIX 19


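    Full-text release date: full text not authorized for public release (on-campus network)
    Full-text release date: full text not authorized for public release (off-campus network)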
