Graduate student: | 朱家漢 Chu, Chia-Han |
---|---|
Thesis title: | 基於角度與距離影像比對技術開發出高效率的蛋白質結構比對方法 An efficient protein structural comparison method based on angle-distance image matching techniques |
Advisor: | 唐傳義 Tang, Chuan-Yi |
Oral defense committee: | |
Degree: | Doctor |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Year of publication: | 2010 |
Academic year of graduation: | 99 |
Language: | English |
Number of pages: | 161 |
Chinese keywords: | 距離影像圖、二級結構比對、蛋白質結構分類、蛋白質結構比對、蛋白質功能區域交換、功能區域交換、結構排比、二級結構元素、功能區域交換偵測、蛋白質構形疾病 |
Foreign-language keywords: | A-D image, secondary structural matching, protein structure classification, protein structural comparison, 3D domain swapping, domain swapping, structural alignment, secondary structural element, domain swapping detection, protein conformational disease |
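Protein structure comparison and classification are frequently used to understand the evolutionary relationships between protein structure and function. Protein structure comparison is, however, very time-consuming. To reduce the computation time, this dissertation develops a novel protein structure comparison method that converts a three-dimensional protein structure into a two-dimensional angle-distance image (A-D image), recasting the structure comparison problem as an image template matching problem. The proposed method not only improves the efficiency of protein structure comparison but, owing to its distinctive properties, also enables applications to problems that conventional methods cannot fully solve.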
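An A-D image is built by first vectorizing each secondary structural element (SSE) of a protein and then using the spatial angle and the center-to-center distance between every pair of SSEs. According to the types of the two SSEs in each pair, each A-D image is decomposed into three sub-images of different types. The transformed images of any two protein structures are then compared by cross-correlation analysis to find similar image patterns. The method is independent of the connectivity of the SSEs and of the spatial orientation of individual domains, so the similarity of two protein structures can be judged from the similarity of patterns in their A-D images. Experimental results show that the proposed method correctly classifies proteins into the folds defined by the SCOP database, even for pairs of proteins with low sequence similarity.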
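The construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the dissertation's implementation: each SSE is assumed to be already reduced to a start and end point, each SSE pair is assumed to contribute one count to an (angle, distance) histogram, and the three sub-images are assumed to correspond to helix-helix, helix-strand, and strand-strand pairs. The function name `ad_images`, the bin sizes, and the 40 Å cutoff are all hypothetical choices.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]
SSE = Tuple[str, Point, Point]  # (type: 'H' helix or 'E' strand, start, end)


def _sub(a: Point, b: Point) -> Point:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _norm(v: Point) -> float:
    return math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)


def _angle_deg(u: Point, v: Point) -> float:
    """Angle between two SSE vectors, in degrees within [0, 180]."""
    cos_t = sum(a * b for a, b in zip(u, v)) / (_norm(u) * _norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def _midpoint(p: Point, q: Point) -> Point:
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2, (p[2] + q[2]) / 2)


def ad_images(sses: List[SSE], angle_bin: float = 10.0,
              dist_bin: float = 2.0, max_dist: float = 40.0
              ) -> Dict[str, List[List[int]]]:
    """Build one angle-by-distance histogram per SSE pair type.

    Bin sizes, the distance cutoff, and the pair-type split are
    illustrative assumptions, not values taken from the dissertation.
    """
    n_rows = int(180 / angle_bin)      # angle axis
    n_cols = int(max_dist / dist_bin)  # distance axis
    images = {k: [[0] * n_cols for _ in range(n_rows)]
              for k in ("HH", "EH", "EE")}
    for i in range(len(sses)):
        for j in range(i + 1, len(sses)):
            t1, s1, e1 = sses[i]
            t2, s2, e2 = sses[j]
            dist = _norm(_sub(_midpoint(s1, e1), _midpoint(s2, e2)))
            if dist >= max_dist:
                continue  # pair too far apart to fit in the image
            angle = _angle_deg(_sub(e1, s1), _sub(e2, s2))
            row = min(int(angle / angle_bin), n_rows - 1)
            col = int(dist / dist_bin)
            images["".join(sorted((t1, t2)))][row][col] += 1
    return images


sses = [
    ("H", (0.0, 0.0, 0.0), (10.0, 0.0, 0.0)),
    ("E", (0.0, 0.0, 6.0), (10.0, 10.0, 6.0)),
    ("H", (0.0, 20.0, 0.0), (10.0, 20.0, 0.0)),
]
images = ad_images(sses)
```

Because only pairwise angles and distances enter the histogram, the result does not change when the whole structure is rotated or translated, and it does not depend on the order in which the SSEs appear along the chain.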
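Building on the A-D image technique, this dissertation proposes another novel method that accurately detects three-dimensional domain swapping (3D domain swapping; DS). DS is a mechanism for forming protein quaternary structure: it can be pictured as several monomers opening their originally closed structures and exchanging the opened portions with one another to form intertwined oligomers. Since the first DS case was reported, growing evidence has indicated that the phenomenon arises under suitable conditions in structures with unconstrained termini. To date, DS has been linked to molecular evolution, functional regulation, Alzheimer’s disease, and prion disease, and it can also be applied to the fabrication of biomaterials. Given the need to understand how DS arises and the current lack of bioinformatics resources for studying it, the development of this method should help move related research fields forward.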
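In summary, this dissertation presents an innovative protein structure comparison method and a novel detection system for DS. Experimental results confirm that the proposed method detects DS relationships more effectively than other existing tools. As future work, a DS database will be built to promote the development of related biological engineering.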
Protein structure comparison (PSC) and classification are widely used to understand the evolutionary relationships between protein structures and functions. However, PSC is computationally expensive because of the multiple dimensions of geometric information and the complexity of the spatial organization of atoms. To reduce this cost, we have developed a novel PSC method that transforms a three-dimensional structure into a two-dimensional angle-distance (A-D) image. By converting geometric comparison problems into image template matching problems, our method not only improves PSC efficiency but also offers unique properties and applications that are difficult to achieve with conventional PSC methods.
A-D images are created from the secondary structure information of proteins. They are then compared using cross-correlation approaches, which are not limited by the connectivity of secondary structural elements (SSEs) or by the spatial orientations of individual domains. Similarities between protein structures are thus identified as similarities between patterns in A-D images. Our experimental results demonstrate that the proposed method accurately and efficiently classifies protein structures at the fold level defined by the SCOP database, even for proteins sharing low sequence identity.
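To make the image-matching step concrete, the following sketch slides a small template over a larger grid and scores each position with zero-mean normalized cross-correlation. It is a generic template-matching example under stated assumptions: the abstract does not specify the exact correlation measure or search strategy, and the names `ncc` and `best_match` are hypothetical.

```python
import math
from typing import List, Tuple

Image = List[List[float]]


def ncc(patch: Image, template: Image) -> float:
    """Zero-mean normalized cross-correlation of two equal-size grids."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    # A constant patch or template has no pattern to correlate with.
    return num / den if den > 0 else 0.0


def best_match(image: Image, template: Image) -> Tuple[float, Tuple[int, int]]:
    """Return the best score and the (row, col) offset where it occurs."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_score, best_pos = -1.0, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_score, best_pos


template = [[1, 0],
            [2, 1]]
image = [[0, 0, 0, 0, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 2, 1, 0],
         [0, 0, 0, 0, 0]]
score, pos = best_match(image, template)
```

A score near 1 means the template pattern appears almost unchanged at that offset. In this setting a high score between two A-D images would suggest a shared arrangement of SSE pairs, though any threshold for calling two structures similar would need to be calibrated on real data.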
Based on the A-D image techniques developed here, we present a novel detection method for three-dimensional domain swapping (DS), the first practical one of its kind. DS is a mechanism for forming protein quaternary structure that can be visualized as if monomers had “opened” their “closed” structures and exchanged the opened portions to form intertwined oligomers. DS is thought to be possible in a protein with an unconstrained terminus under appropriate conditions. It may play important roles in the molecular evolution and functional regulation of proteins, and in the development of Alzheimer’s and prion diseases. In addition, DS is promising for the design of self-assembling biomaterials. Given the increasing interest in DS and the lack of bioinformatics resources specifically designed for studying it, our developments may help move related fields forward.
In summary, this dissertation proposes a new PSC methodology and a novel detection system for DS. The proposed system has been shown to detect DS relationships more effectively than well-known existing sequence/structural alignment and domain motion detection methods. In the future, a DS database is expected to be built to promote the development of related biological engineering.