
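Graduate Student: 羅惟正
Lo, Wei-Cheng
Thesis Title: 蛋白質結構快速搜尋法SARST之開發及其於特殊結構偵測之應用
SARST: an efficient protein structural similarity search method applied to the detection of novel protein structural relationships
Advisor: 呂平江
Lyu, Ping-Chiang
Oral Defense Committee:
Degree: Doctor
Department: College of Life Sciences and Medicine - Institute of Bioinformatics and Structural Biology
Year of Publication: 2009
Academic Year of Graduation: 97 (ROC calendar)
Language: English
Number of Pages: 150
Keywords (Chinese, translated): linear encoding, protein structure, protein structure search and alignment, circular permutation of protein structures, structural domain swapping in proteins, structural rearrangement
Keywords (English): linear encoding, protein structure, SARST, circular permutation, three-dimensional domain swapping, structural rearrangement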
    This thesis is a journey into comparative protein structural bioinformatics that begins with a linear encoding algorithm for three-dimensional (3D) protein structures. Beyond applying this algorithm to structural similarity searches, it introduces several interesting biological phenomena along the way, including circular permutation (CP), 3D domain swapping (DS) and more complicated structural rearrangements in proteins.
    Protein structural data are growing exponentially, so more efficient tools are required for structural similarity searches. In this work, the Ramachandran Sequential Transformation (RST) algorithm is proposed to linearly encode protein structural data into sequence form, and a database search tool, SARST (Structural similarity search Aided by RST), has been developed. SARST is comparable to Combinatorial Extension in accuracy while running ~240,000 times faster. Building on this speed advantage and the unique properties of linearly encoded structural data, RST has been successfully applied to detecting interesting protein structural relationships such as CP and DS, which are difficult to identify with conventional methods.
    CP is an evolutionary event that leaves structurally similar proteins with termini at different locations. Because of the complicated rearrangements involved, convenient computational resources for studying CP were not available before this work. We have developed an efficient search tool, CPSARST (CP Search Aided by RST); the first CP database, CPDB; and a predictor of viable CP sites, CPred.
    DS is a mechanism for forming protein quaternary structures from monomers. It is defined as two or more protein chains exchanging an identical part of their structure to form intertwined oligomers. A new DS-detecting method, DS-SARST (DS Search Aided by RST), is described in this thesis.
    Finally, a multi-processor, batch-processing web server, iSARST, is introduced before the thesis closes with several proposed future works related to RST. iSARST is an integrated implementation of several structural comparison tools and RST-based search methods. It can be used to retrieve common structural homologs from large databases or to identify DS, CP and structural rearrangements more complicated than CP.
    To sum up, the algorithms, search methods and web services described in this thesis give scientists an efficient, innovative set of resources for searching, comparing and analyzing protein structures, especially those with novel relationships. We expect them to aid the understanding of protein evolutionary mechanisms and to facilitate the application of structural permutation and 3D domain swapping in bioengineering and biotechnology.
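    The idea of linearly encoding a structure can be illustrated with a minimal sketch. It assumes each residue is represented by its backbone torsion angles (phi, psi) and that the Ramachandran map is cut into a uniform 72-degree grid whose cells are labeled with letters. The grid size, the alphabet and the function names are hypothetical choices made for illustration only; this record does not specify how RST actually partitions the map or which scoring matrices SARST uses.

```python
import string

# Hypothetical settings (not taken from the thesis): a uniform grid with
# 72-degree bins gives 5 x 5 = 25 cells, labeled 'A' through 'Y'.
BIN_DEG = 72
BINS_PER_AXIS = 360 // BIN_DEG
ALPHABET = string.ascii_uppercase[:BINS_PER_AXIS ** 2]


def encode_residue(phi, psi):
    """Map one (phi, psi) pair, in degrees within [-180, 180], to a letter."""
    col = int((phi + 180.0) % 360.0 // BIN_DEG)
    row = int((psi + 180.0) % 360.0 // BIN_DEG)
    return ALPHABET[row * BINS_PER_AXIS + col]


def encode_chain(torsions):
    """Turn a list of (phi, psi) pairs into a one-dimensional string."""
    return "".join(encode_residue(phi, psi) for phi, psi in torsions)


if __name__ == "__main__":
    # Three helix-like residues followed by two strand-like residues.
    chain = [(-60, -45)] * 3 + [(-120, 130)] * 2
    print(encode_chain(chain))
```

    Once structures are plain strings, they can be compared with ordinary sequence-style methods, which is where the speed advantage described above would come from.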


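    Through a linear encoding method for three-dimensional protein structures, this thesis takes the reader on a tour of protein structural biology. Besides describing how this linear encoding method is applied to protein structural similarity searches, it introduces several interesting biological phenomena, such as circular permutation (CP), 3D domain swapping (DS) and complicated protein structural rearrangements.
    The volume of protein structural data is growing exponentially, so efficient structure comparison and search methods are very important to structural biology research. In this study we propose the Ramachandran Sequential Transformation (RST), which converts 3D protein structural data into one-dimensional text strings, and we have developed a high-speed protein structure search tool, SARST (Structural similarity search Aided by RST). SARST's accuracy in structure searches is close to that of Combinatorial Extension, while it is more than 240,000 times faster. Building on this speed advantage and the properties of one-dimensional structural information, we have successfully applied the RST algorithm to detecting novel and interesting protein structural relationships, such as CP and DS, which are not easily detected by conventional methods. CP results from a series of evolutionary events and gives structurally very similar proteins different openings (termini). In other words, it can be viewed as the original N- and C-termini of a protein being joined together and a new opening being created elsewhere. Because such complicated rearrangement relationships are hard to analyze, related bioinformatics resources were scarce before our work began. Here we have developed a high-performance CP search tool, CPSARST (CP Search Aided by RST); the world's first CP database, CPDB; and a CP site prediction program, CPred (CP site predictor). DS is a mechanism by which monomeric proteins combine into oligomers with quaternary structure. It is defined as two or more proteins exchanging an identical structural region with each other to form intertwined oligomers. In this thesis we introduce a new DS detection program, DS-SARST (DS Search Aided by RST). In addition to these tools, we have developed a multitasking, batch-processing web server for protein structure searches, named iSARST. iSARST integrates the RST-based search tools with several well-known structure comparison methods to provide fast and accurate protein structure search services. It can detect ordinary structural homologs, DS and CP, as well as more complicated types of structural rearrangement. At the end, we propose several future applications of RST to illustrate its potential.
    In summary, the algorithms, search methods and web services described in this thesis provide scientists with a creative and efficient set of bioinformatics resources for searching, comparing and analyzing protein structures, especially novel structural relationships between proteins. We expect these resources to help in understanding protein evolution and to accelerate the application of techniques related to protein structural rearrangement and domain swapping in protein engineering and biotechnology.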
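    The description of circular permutation as joining the original N- and C-termini and reopening the chain elsewhere can be sketched on a linearly encoded string. The example below is a toy under stated assumptions: it uses exact string matching and a doubled-target trick that is common for detecting rotations, and the function names are hypothetical. It is not presented as the method used by CPSARST, which would have to tolerate mismatches between real structures.

```python
def circular_permute(seq, cut):
    """Join the two ends of seq, then reopen it just before position cut."""
    if not seq:
        return seq
    cut %= len(seq)
    return seq[cut:] + seq[:cut]


def find_cp_site(query, target):
    """Return the cut position that turns target into query, or None.

    Any rotation of target appears as a substring of target + target,
    so one exact search over the doubled string covers every possible cut.
    A result of 0 means the two strings are identical (no permutation).
    """
    if not query or len(query) != len(target):
        return None
    pos = (target + target).find(query)
    return None if pos < 0 else pos


if __name__ == "__main__":
    target = "ABCDEFGH"                     # stand-in for an encoded structure
    query = circular_permute(target, 3)     # new termini created at position 3
    print(query, find_cp_site(query, target))
```

    Working on one-dimensional strings keeps this kind of check cheap, which is in line with the abstract's point that linear encoding makes such rearrangements easier to search for than with conventional structure comparison.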

    Chinese Abstract
    Abstract
    Acknowledgements
    Contents
    Introduction
    Chapter 1. Ramachandran Sequential Transformation (RST) and SARST: a linear encoding methodology for protein structural data and its application to protein structural similarity searches
        1.1 Background
        1.2 Results
            1.2.1 Algorithm – Ramachandran sequential transformation (RST)
            1.2.2 Building scoring matrices – a regenerative approach
            1.2.3 Optimization of the scoring matrix
            1.2.4 Evaluation of speed
            1.2.5 Evaluation of accuracy
            1.2.6 Implementation: Performance using different structural classes
            1.2.7 Performance on incomplete structures
            1.2.8 Effects of low sequence identities
            1.2.9 Reliability of searching results
            1.2.10 Normalization of SARST scores
            1.2.11 Distantly related homologs retrieved by SARST: two examples
        1.3 Discussion
            1.3.1 On speed
            1.3.2 On accuracy
            1.3.3 On improvements
            1.3.4 Significance of SARST score
            1.3.5 Expected applications of SARST
        1.4 Conclusions
        1.5 Methods
            1.5.1 Optimization of the search engine parameters
            1.5.2 Practical parameter settings for SARST
            1.5.3 Assessment of speed and precision
        1.6 Acknowledgements
        1.7 Tables
    Chapter 2. CPSARST: an application of RST to the detection of circular permutation in proteins
        2.1 Background
        2.2 Results
            2.2.1 Performance on random circular permutants
            2.2.2 Accuracy evaluations with engineered CPs
            2.2.3 Pair-wise comparisons of naturally occurring CPs
            2.2.4 Protein structural database searches
            2.2.5 Novel CP family detected by CPSARST
            2.2.6 Circular permutants detected by CPSARST
        2.3 Discussion
            2.3.1 Detecting circular permutants with low sequence identities
            2.3.2 Speed improvements
            2.3.3 The prevalence and definition of circular permutation
            2.3.4 Possible applications of CPSARST
        2.4 Conclusions
        2.5 Materials and methods
            2.5.1 Linear encoding of protein structures
            2.5.2 Generation and analyses of random circular permutants
            2.5.3 Screening of CP candidates
            2.5.4 Refinement of the search results
            2.5.5 Pair-wise CP structural alignments
            2.5.6 Implementation
        2.6 Acknowledgements
        2.7 Tables
    Chapter 3. CPDB: the first database of circular permutation in proteins
        3.1 Background
        3.2 Contents and methods
            3.2.1 Identification of circular permutation
            3.2.2 Categorization of circular permutants
            3.2.3 Circularly-permuted alignments and the visualization of CP relationships
            3.2.4 Prediction of viable circular permutants
        3.3 Web interface
        3.4 Future works
        3.5 Acknowledgements
    Chapter 4. DS-SARST: an efficient three-dimensional domain swapping search tool based on RST methodology
        4.1 Introduction
        4.2 Results and Discussions
            4.2.1 Performance of the SVM classifier in DS-SARST
            4.2.2 Experiments on published DS pairs
            4.2.3 Hinge loop detection
        4.3 Conclusion and future works
        4.4 Materials and methods
            4.4.1 Linear encoding of protein structures
            4.4.2 The double filter-and-refine strategy
            4.4.3 Alternative structure alignment
            4.4.4 Preparation of training dataset
            4.4.5 Setting up Filter I
            4.4.6 Setting up the SVM classifier
            4.4.7 Hinge loop detections
        4.5 Acknowledgements
        4.6 Tables
    Chapter 5. iSARST: an integrated web server for rapid protein structural alignment searches
        5.1 Introduction
        5.2 Methods
            5.2.1 Linear encoding of protein structures
            5.2.2 Structural similarity searches
            5.2.3 Refinement of searching results
            5.2.4 Multi-processor implementations
        5.3 Experiments
        5.4 Web server description
            5.4.1 Input and the searching page
            5.4.2 Output: hit list
            5.4.3 Output: structure inspection page
        5.5 Expected applications
        5.6 Acknowledgements
        5.7 Tables
    Chapter 6. Ongoing projects related to SARST and other possible applications of RST
        6.1 Prediction of viable circular permutation sites using weighted closeness
            6.1.1 Background
            6.1.2 Methods
            6.1.3 Preliminary results
            6.1.4 Future works
        6.2 FASARST: a potential method for the detection of complicated protein structural rearrangements by fragmented alignment searches
            6.2.1 Background
            6.2.2 Methods
            6.2.3 Future works
        6.3 Other possible applications of RST and SARST-related proposals
    Chapter 7. Concluding remarks
    References
    Appendices
        Appendix 1.1 The training set of SARST
        Appendix 1.2 Performances of SARST scoring matrices
        Appendix 1.3 The query and target proteins for information retrieval experiments
        Appendix 1.4 Effects of gap penalties
        Appendix 1.5 Query proteins with incomplete structures
        Appendix 2.1 The nrPDB-90 dataset
        Appendix 2.2 The nrSCOP-90 dataset
        Appendix 2.3 Candidate CP pairs in nrPDB-90 detected by CPSARST
        Appendix 2.4 Structural neighbors of protein YlqF retrieved by DALI
        Appendix 2.5 Additional statistics of CPSARST database searching
        Appendix 2.6 The RCP dataset
        Appendix 2.7 Score and E-value ratios calculated from the RCP dataset
        Appendix 2.8 Parameter settings of CPSARST used in this study
        Appendix 3.1 The non-redundant PDB dataset used for the development of CPDB
        Appendix 3.2 Classification of proteins according to the secondary structural elemental contents

    1. Kendrew JC, Bodo G, Dintzis HM, Parrish RG, Wyckoff H, Phillips DC: A three-dimensional model of the myoglobin molecule obtained by x-ray analysis. Nature 1958, 181(4610):662-666.
    2. About the PDB Archive and the RCSB PDB [http://www.pdb.org/pdb/static.do?p=general_information/about_pdb/index.html].
    3. Holm L, Sander C: Protein structure comparison by alignment of distance matrices. J Mol Biol 1993, 233(1):123-138.
    4. Madej T, Gibrat JF, Bryant SH: Threading a database of protein cores. Proteins 1995, 23(3):356-369.
    5. Shindyalov IN, Bourne PE: Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Eng 1998, 11(9):739-747.
    6. Sauder JM, Arthur JW, Dunbrack RL, Jr.: Large-scale comparison of protein sequence alignment algorithms with structure alignments. Proteins 2000, 40(1):6-22.
    7. Zhang Y, Skolnick J: TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res 2005, 33(7):2302-2309.
    8. Zhu J, Weng Z: FAST: a novel protein structure alignment algorithm. Proteins 2005, 58(3):618-627.
    9. Huang PJ: SARST: structure alignment by Ramachandran search tool. Hsinchu, Taiwan, R.O.C.: National Tsing Hua University; 2003.
    10. Jain AK, Dubes RC: Algorithms for clustering data. New Jersey: Prentice Hall; 1988.
    11. Henikoff S, Henikoff JG: Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A 1992, 89(22):10915-10919.
    12. Lo WC, Huang PJ, Chang CH, Lyu PC: Protein structural similarity search by Ramachandran codes. BMC Bioinformatics 2007, 8:307.
    13. Tsai LC, Shyur LF, Lee SH, Lin SS, Yuan HS: Crystal structure of a natural circularly permuted jellyroll protein: 1,3-1,4-beta-D-glucanase from Fibrobacter succinogenes. J Mol Biol 2003, 330(3):607-620.
    14. Ribeiro EA, Jr., Ramos CH: Circular permutation and deletion studies of myoglobin indicate that the correct position of its N-terminus is required for native stability and solubility but not for native-like heme binding and folding. Biochemistry 2005, 44(12):4699-4709.
    15. Lo WC, Lyu PC: CPSARST: an efficient circular permutation search tool applied to the detection of novel protein structural relationships. Genome Biol 2008, 9(1):R11.
    16. Lo WC, Lee CC, Lee CY, Lyu PC: CPDB: a database of circular permutation in proteins. Nucleic Acids Res 2009, 37(Database issue):D328-332.
    17. Yuan X, Bystroff C: Non-sequential structure-based alignments reveal topology-independent core packing arrangements in proteins. Bioinformatics 2005, 21(7):1010-1019.
    18. Kolbeck B, May P, Schmidt-Goenner T, Steinke T, Knapp EW: Connectivity independent protein-structure alignment: a hierarchical approach. BMC Bioinformatics 2006, 7:510.
    19. Vesterstrom J, Taylor WR: Flexible secondary structure based protein structure comparison applied to the detection of circular permutation. J Comput Biol 2006, 13(1):43-63.
    20. Chen L, Wu LY, Wang Y, Zhang S, Zhang XS: Revealing divergent evolution, identifying circular permutations and detecting active-sites by protein structure comparison. BMC Struct Biol 2006, 6:18.
    21. Bennett MJ, Schlunegger MP, Eisenberg D: 3D domain swapping: a mechanism for oligomer assembly. Protein Sci 1995, 4(12):2455-2468.
    22. Liu Y, Eisenberg D: 3D domain swapping: as domains continue to swap. Protein Sci 2002, 11(6):1285-1299.
    23. Bennett MJ, Eisenberg D: The evolving role of 3D domain swapping in proteins. Structure 2004, 12(8):1339-1341.
    24. Lo WC, Lee CY, Lee CC, Lyu PC: iSARST: an integrated SARST web server for rapid protein structural similarity searches. Nucleic Acids Res 2009, 37(Web Server issue):W545-551.
    25. Chothia C, Lesk AM: The relation between the divergence of sequence and structure in proteins. Embo J 1986, 5(4):823-826.
    26. Murzin AG: How far divergent evolution goes in proteins. Curr Opin Struct Biol 1998, 8(3):380-387.
    27. Chou KC, Carlacci L: Energetic approach to the folding of alpha/beta barrels. Proteins 1991, 9(4):280-295.
    28. Lasters I, Wodak SJ, Alard P, van Cutsem E: Structural principles of parallel beta-barrels in proteins. Proc Natl Acad Sci U S A 1988, 85(10):3338-3342.
    29. Lasters I, Wodak SJ, Pio F: The design of idealized alpha/beta-barrels: analysis of beta-sheet closure requirements. Proteins 1990, 7(3):249-256.
    30. Murzin AG, Lesk AM, Chothia C: Principles determining the structure of beta-sheet barrels in proteins. I. A theoretical analysis. J Mol Biol 1994, 236(5):1369-1381.
    31. Richardson JS, Richardson DC: Principles and patterns of protein conformation. In: Prediction of protein structure and the principles of protein conformations. New York: Plenum; 1989: 1-98.
    32. Sheridan RP, Dixon JS, Venkataraghavan R, Kuntz ID, Scott KP: Amino acid composition and hydrophobicity patterns of protein domains correlate with their structures. Biopolymers 1985, 24(10):1995-2023.
    33. Mitchell EM, Artymiuk PJ, Rice DW, Willett P: Use of techniques derived from graph theory to compare secondary structure motifs in proteins. J Mol Biol 1990, 212(1):151-166.
    34. Levine M, Stuart D, Williams J: A method for the systematic comparison of the three-dimensional structures of proteins and some results. Acta Cryst 1984, A40:600-610.
    35. Efimov AV: Standard structures in proteins. Prog Biophys Mol Biol 1993, 60(3):201-239.
    36. Lesk AM: Application of sequence alignment methods to multiple structural alignment and superposition. In: Proceedings of Prague Stringology Club Workshop '98. Prague; 1998: 95-100.
    37. Martin AC: The ups and downs of protein topology; rapid comparison of protein structure. Protein Eng 2000, 13(12):829-837.
    38. Guyon F, Camproux AC, Hochez J, Tuffery P: SA-Search: a web tool for protein structure mining based on a Structural Alphabet. Nucleic Acids Res 2004, 32(Web Server issue):W545-548.
    39. Shi J, Blundell TL, Mizuguchi K: FUGUE: sequence-structure homology recognition using environment-specific substitution tables and structure-dependent gap penalties. J Mol Biol 2001, 310(1):243-257.
    40. Kelley LA, MacCallum RM, Sternberg MJ: Enhanced genome annotation using structural profiles in the program 3D-PSSM. J Mol Biol 2000, 299(2):499-520.
    41. Carpentier M, Brouillet S, Pothier J: YAKUSA: a fast structural database scanning method. Proteins 2005, 61(1):137-151.
    42. Yang JM, Tung CH: Protein structure database search and evolutionary classification. Nucleic Acids Res 2006, 34(13):3646-3659.
    43. Tyagi M, Gowri VS, Srinivasan N, de Brevern AG, Offmann B: A substitution matrix for structural alphabet based on structural alignment of homologous proteins and its applications. Proteins 2006, 65(1):32-39.
    44. de Brevern AG, Benros C, Gautier R, Valadie H, Hazout S, Etchebest C: Local backbone structure prediction of proteins. In Silico Biol 2004, 4(3):381-386.
    45. de Brevern AG, Etchebest C, Hazout S: Bayesian probabilistic approach for predicting backbone structures in terms of protein blocks. Proteins 2000, 41(3):271-287.
    46. Etchebest C, Benros C, Hazout S, de Brevern AG: A structural alphabet for local protein structures: improved prediction methods. Proteins 2005, 59(4):810-827.
    47. De Brevern AG, Etchebest C, Benros C, Hazout S: "Pinning strategy": a novel approach for predicting the backbone structure in terms of protein blocks from sequence. J Biosci 2007, 32(1):51-70.
    48. Ramachandran GN, Sasisekharan V: Conformation of polypeptides and proteins. Adv Protein Chem 1968, 23:283-438.
    49. Aung Z, Tan KL: Rapid 3D protein structure database searching using information retrieval techniques. Bioinformatics 2004, 20(7):1045-1052.
    50. Bertino E, Ooi BC, Sacks-Davis R, Tan KL, Zobel J, Shidlovsky B, Catania B: Indexing techniques for advanced database systems: Kluwer Academic Publishers; 1997.
    51. Brenner SE, Koehl P, Levitt M: The ASTRAL compendium for protein structure and sequence analysis. Nucleic Acids Res 2000, 28(1):254-256.
    52. Chandonia JM, Hon G, Walker NS, Lo Conte L, Koehl P, Levitt M, Brenner SE: The ASTRAL Compendium in 2004. Nucleic Acids Res 2004, 32(Database issue):D189-192.
    53. Chandonia JM, Walker NS, Lo Conte L, Koehl P, Levitt M, Brenner SE: ASTRAL compendium enhancements. Nucleic Acids Res 2002, 30(1):260-263.
    54. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol 1990, 215(3):403-410.
    55. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25(17):3389-3402.
    56. Hripcsak G, Rothschild AS: Agreement, the f-measure, and reliability in information retrieval. J Am Med Inform Assoc 2005, 12(3):296-298.
    57. Salton G, McGill MJ: Retrieval evaluation. In: Introduction to modern information retrieval. New York: McGraw-Hill; 1983: 174-177.
    58. Laskowski RA, Rullmannn JA, MacArthur MW, Kaptein R, Thornton JM: AQUA and PROCHECK-NMR: programs for checking the quality of protein structures solved by NMR. J Biomol NMR 1996, 8(4):477-486.
    59. Camproux AC, Gautier R, Tuffery P: A hidden markov model derived structural alphabet for proteins. J Mol Biol 2004, 339(3):591-605.
    60. Camproux AC, Tuffery P, Chevrolat JP, Boisvieux JF, Hazout S: Hidden Markov model approach for identifying the modular framework of the protein backbone. Protein Eng 1999, 12(12):1063-1073.
    61. Altschul SF, Gish W: Local alignment statistics. Methods Enzymol 1996, 266:460-480.
    62. Park J, Karplus K, Barrett C, Hughey R, Haussler D, Hubbard T, Chothia C: Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods. J Mol Biol 1998, 284(4):1201-1210.
    63. Getz G, Vendruscolo M, Sachs D, Domany E: Automated assignment of SCOP and CATH protein structure classifications from FSSP scores. Proteins 2002, 46(4):405-415.
    64. Holm L, Sander C: The FSSP database of structurally aligned protein fold families. Nucleic Acids Res 1994, 22(17):3600-3609.
    65. Higgins DG, Thompson JD, Gibson TJ: Using CLUSTAL for multiple sequence alignments. Methods Enzymol 1996, 266:383-402.
    66. Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994, 22(22):4673-4680.
    67. Edgar RC: MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 2004, 5:113.
    68. Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 2004, 32(5):1792-1797.
    69. DeLano WL: The PyMOL molecular graphics system. In. San Carlos, CA, USA: DeLano Scientific; 2002.
    70. Protein-protein BLAST [http://www.ncbi.nlm.nih.gov/blast/Blast.cgi].
    71. Jeltsch A: Circular permutations in the molecular evolution of DNA methyltransferases. J Mol Evol 1999, 49(1):161-164.
    72. Weiner J, 3rd, Thomas G, Bornberg-Bauer E: Rapid motif-based prediction of circular permutations in multi-domain proteins. Bioinformatics 2005, 21(7):932-937.
    73. Cunningham BA, Hemperly JJ, Hopp TP, Edelman GM: Favin versus concanavalin A: Circularly permuted amino acid sequences. Proc Natl Acad Sci U S A 1979, 76(7):3218-3222.
    74. Lindqvist Y, Schneider G: Circular permutations of natural protein sequences: structural evidence. Curr Opin Struct Biol 1997, 7(3):422-427.
    75. Murzin AG: Probable circular permutation in the flavin-binding domain. Nat Struct Biol 1998, 5(2):101.
    76. Castillo RM, Mizuguchi K, Dhanaraj V, Albert A, Blundell TL, Murzin AG: A six-stranded double-psi beta barrel is shared by several protein superfamilies. Structure 1999, 7(2):227-236.
    77. Polekhina G, Board PG, Gali RR, Rossjohn J, Parker MW: Molecular basis of glutathione synthetase deficiency and a rare gene permutation event. Embo J 1999, 18(12):3204-3213.
    78. Bujnicki JM: Sequence permutations in the molecular evolution of DNA methyltransferases. BMC Evol Biol 2002, 2:3.
    79. Jung J, Lee B: Protein structure alignment using environmental profiles. Protein Eng 2000, 13(8):535-543.
    80. Antcheva N, Pintar A, Patthy A, Simoncsits A, Barta E, Tchorbanov B, Pongor S: Proteins of circularly permuted sequence present within the same organism: the major serine proteinase inhibitor from Capsicum annuum seeds. Protein Sci 2001, 10(11):2280-2290.
    81. Goldenberg DP, Creighton TE: Circular and circularly permuted forms of bovine pancreatic trypsin inhibitor. J Mol Biol 1983, 165(2):407-413.
    82. Vogel C, Morea V: Duplication, divergence and formation of novel protein topologies. Bioessays 2006, 28(10):973-978.
    83. Qian Z, Lutz S: Improving the catalytic activity of Candida antarctica lipase B by circular permutation. J Am Chem Soc 2005, 127(39):13466-13467.
    84. Anantharaman V, Koonin EV, Aravind L: Regulatory potential, phyletic distribution and evolution of ancient, intracellular small-molecule-binding domains. J Mol Biol 2001, 307(5):1271-1292.
    85. Todd AE, Orengo CA, Thornton JM: Plasticity of enzyme active sites. Trends Biochem Sci 2002, 27(8):419-426.
    86. Bulaj G, Koehn RE, Goldenberg DP: Alteration of the disulfide-coupled folding pathway of BPTI by circular permutation. Protein Sci 2004, 13(5):1182-1196.
    87. Heinemann U, Hahn M: Circular permutations of protein sequence: not so rare? Trends Biochem Sci 1995, 20(9):349-350.
    88. Li L, Shakhnovich EI: Different circular permutations produced different folding nuclei in proteins: a computational study. J Mol Biol 2001, 306(1):121-132.
    89. Chen J, Wang J, Wang W: Transition states for folding of circular-permuted proteins. Proteins 2004, 57(1):153-171.
    90. Schwartz TU, Walczak R, Blobel G: Circular permutation as a tool to reduce surface entropy triggers crystallization of the signal recognition particle receptor beta subunit. Protein Sci 2004, 13(10):2814-2818.
    91. Anand B, Verma SK, Prakash B: Structural stabilization of GTP-binding domains in circularly permuted GTPases: implications for RNA binding. Nucleic Acids Res 2006, 34(8):2196-2205.
    92. Gebhard LG, Risso VA, Santos J, Ferreyra RG, Noguera ME, Ermacora MR: Mapping the distribution of conformational information throughout a protein sequence. J Mol Biol 2006, 358(1):280-288.
    93. Kojima M, Ayabe K, Ueda H: Importance of terminal residues on circularly permutated Escherichia coli alkaline phosphatase with high specific activity. J Biosci Bioeng 2005, 100(2):197-202.
    94. Ostermeier M: Engineering allosteric protein switches by domain insertion. Protein Eng Des Sel 2005, 18(8):359-364.
    95. Galarneau A, Primeau M, Trudeau LE, Michnick SW: Beta-lactamase protein fragment complementation assays as in vivo and in vitro sensors of protein protein interactions. Nat Biotechnol 2002, 20(6):619-622.
    96. Baird GS, Zacharias DA, Tsien RY: Circular permutation and receptor insertion within green fluorescent proteins. Proc Natl Acad Sci U S A 1999, 96(20):11241-11246.
    97. Jung J, Lee B: Circularly permuted proteins in the protein structure database. Protein Sci 2001, 10(9):1881-1886.
    98. Uliel S, Fliess A, Unger R: Naturally occurring circular permutations in proteins. Protein Eng 2001, 14(8):533-542.
    99. Carrington DM, Auffret A, Hanke DE: Polypeptide ligation occurs during post-translational modification of concanavalin A. Nature 1985, 313(5997):64-67.
    100. Ponting CP, Russell RB: Swaposins: circular permutations within genes encoding saposin homologues. Trends Biochem Sci 1995, 20(5):179-180.
    101. Peisajovich SG, Rockah L, Tawfik DS: Evolution of new protein topologies through multistep gene rearrangements. Nat Genet 2006, 38(2):168-174.
    102. Russell RB, Ponting CP: Protein fold irregularities that hinder sequence analysis. Curr Opin Struct Biol 1998, 8(3):364-371.
    103. Uliel S, Fliess A, Amir A, Unger R: A simple algorithm for detecting circular permutations in proteins. Bioinformatics 1999, 15(11):930-936.
    104. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The Protein Data Bank. Nucleic Acids Res 2000, 28(1):235-242.
    105. Lu G: Top: A new method for protein structure comparisons and similarity searches. J Appl Cryst 2000, 33:176-183.
    106. Pearson WR: Flexible sequence similarity searching with the FASTA3 program package. Methods Mol Biol 2000, 132:185-219.
    107. Horn C, Sohn-Bosser L, Breed J, Welte W, Schmitt L, Bremer E: Molecular determinants for substrate specificity of the ligand-binding protein OpuAC from Bacillus subtilis for the compatible solutes glycine betaine and proline betaine. J Mol Biol 2006, 357(2):592-606.
    108. Schiefner A, Breed J, Bosser L, Kneip S, Gade J, Holtmann G, Diederichs K, Welte W, Bremer E: Cation-pi interactions as determinants for binding of the compatible solutes glycine betaine and proline betaine by the periplasmic ligand-binding protein ProX from Escherichia coli. J Biol Chem 2004, 279(7):5588-5596.
    109. Schiefner A, Holtmann G, Diederichs K, Welte W, Bremer E: Structural basis for the binding of compatible solutes by ProX from the hyperthermophilic archaeon Archaeoglobus fulgidus. J Biol Chem 2004, 279(46):48270-48281.
    110. Ostermeier C, Brunger AT: Structural basis of Rab effector specificity: crystal structure of the small G protein Rab3A complexed with the effector domain of rabphilin-3A. Cell 1999, 96(3):363-374.
    111. The Dali Server [http://www.ebi.ac.uk/dali/].
    112. Toms AV, Haas AL, Park JH, Begley TP, Ealick SE: Structural characterization of the regulatory proteins TenA and TenI from Bacillus subtilis and identification of TenA as a thiaminase II. Biochemistry 2005, 44(7):2319-2329.
    113. Chen CC, Han Y, Niu W, Kulakova AN, Howard A, Quinn JP, Dunaway-Mariano D, Herzberg O: Structure and kinetics of phosphonopyruvate hydrolase from Variovorax sp. Pal2: new insight into the divergence of catalysis within the PEP mutase/isocitrate lyase superfamily. Biochemistry 2006, 45(38):11491-11504.
    114. Bateman A, Birney E, Durbin R, Eddy SR, Howe KL, Sonnhammer EL: The Pfam protein families database. Nucleic Acids Res 2000, 28(1):263-266.
    115. Zuckerkandl E, Pauling L: Evolutionary divergence and convergence in proteins. In: Evolving genes and proteins. New York: Academic Press; 1965: 97-166.
    116. Krishna SS, Grishin NV: Structurally analogous proteins do exist! Structure 2004, 12(7):1125-1127.
    117. Theobald DL, Wuttke DS: Divergent evolution within protein superfolds inferred from profile-based phylogenetics. J Mol Biol 2005, 354(3):722-737.
    118. Ausiello G, Peluso D, Via A, Helmer-Citterich M: Local comparison of protein structures highlights cases of convergent evolution in analogous functional sites. BMC Bioinformatics 2007, 8 Suppl 1:S24.
    119. Paszkiewicz KH, Sternberg MJ, Lappe M: Prediction of viable circular permutants using a graph theoretic approach. Bioinformatics 2006, 22(11):1353-1358.
    120. Weiner J, 3rd, Bornberg-Bauer E: Evolution of circular permutations in multidomain proteins. Mol Biol Evol 2006, 23(4):734-743.
    121. Westbrook J, Feng Z, Jain S, Bhat TN, Thanki N, Ravichandran V, Gilliland GL, Bluhm W, Weissig H, Greer DS et al: The Protein Data Bank: unifying the archive. Nucleic Acids Res 2002, 30(1):245-248.
    122. Amitai G, Shemesh A, Sitbon E, Shklar M, Netanely D, Venger I, Pietrokovski S: Network analysis of protein structures identifies functional residues. J Mol Biol 2004, 344(4):1135-1146.
    123. Murzin AG, Brenner SE, Hubbard T, Chothia C: SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 1995, 247(4):536-540.
    124. Harris MA, Clark J, Ireland A, Lomax J, Ashburner M, Foulger R, Eilbeck K, Lewis S, Marshall B, Mungall C et al: The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res 2004, 32(Database issue):D258-261.
    125. Breitkreutz BJ, Stark C, Tyers M: Osprey: a network visualization system. Genome Biol 2003, 4(3):R22.
    126. Case DA, Cheatham TE, 3rd, Darden T, Gohlke H, Luo R, Merz KM, Jr., Onufriev A, Simmerling C, Wang B, Woods RJ: The Amber biomolecular simulation programs. J Comput Chem 2005, 26(16):1668-1688.
    127. MDL Chime. In.: MDL Information Systems, Inc.
    128. Bennett MJ, Choe S, Eisenberg D: Refined structure of dimeric diphtheria toxin at 2.0 A resolution. Protein Sci 1994, 3(9):1444-1463.
    129. Ogihara NL, Ghirlanda G, Bryson JW, Gingery M, DeGrado WF, Eisenberg D: Design of three-dimensional domain-swapped dimers and fibrous oligomers. Proc Natl Acad Sci U S A 2001, 98(4):1404-1409.
    130. Jaskolski M: 3D domain swapping, protein oligomerization, and amyloid formation. Acta Biochim Pol 2001, 48(4):807-827.
    131. Janowski R, Kozak M, Jankowska E, Grzonka Z, Grubb A, Abrahamson M, Jaskolski M: Human cystatin C, an amyloidogenic protein, dimerizes through three-dimensional domain swapping. Nat Struct Biol 2001, 8(4):316-320.
    132. Knaus KJ, Morillas M, Swietnicki W, Malone M, Surewicz WK, Yee VC: Crystal structure of the human prion protein reveals a mechanism for oligomerization. Nat Struct Biol 2001, 8(9):770-774.
    133. Green SM, Gittis AG, Meeker AK, Lattman EE: One-step evolution of a dimer from a monomeric protein. Nat Struct Biol 1995, 2(9):746-751.
    134. Raag R, Whitlow M: Single-chain Fvs. FASEB J 1995, 9(1):73-80.
    135. Lapatto R, Nalini V, Bax B, Driessen H, Lindley PF, Blundell TL, Slingsby C: High resolution structure of an oligomeric eye lens beta-crystallin. Loops, arches, linkers and interfaces in beta B2 dimer compared to a monomeric gamma-crystallin. J Mol Biol 1991, 222(4):1067-1083.
    136. Trinkl S, Glockshuber R, Jaenicke R: Dimerization of beta B2-crystallin: the role of the linker peptide and the N- and C-terminal extensions. Protein Sci 1994, 3(9):1392-1400.
    137. Dehouck Y, Biot C, Gilis D, Kwasigroch JM, Rooman M: Sequence-structure signals of 3D domain swapping in proteins. J Mol Biol 2003, 330(5):1215-1225.
    138. Ye Y, Godzik A: Flexible structure alignment by chaining aligned fragment pairs allowing twists. Bioinformatics 2003, 19 Suppl 2:ii246-255.
    139. Bode W, Engh R, Musil D, Thiele U, Huber R, Karshikov A, Brzin J, Kos J, Turk V: The 2.0 Å X-ray crystal structure of chicken egg white cystatin and its possible mode of interaction with cysteine proteinases. EMBO J 1988, 7(8):2593-2599.
    140. Picone D, Di Fiore A, Ercole C, Franzese M, Sica F, Tomaselli S, Mazzarella L: The role of the hinge loop in domain swapping. The special case of bovine seminal ribonuclease. J Biol Chem 2005, 280(14):13771-13778.
    141. Hayward S, Berendsen HJ: Systematic analysis of domain motions in proteins from conformational change: new results on citrate synthase and T4 lysozyme. Proteins 1998, 30(2):144-154.
    142. Hayward S: Structural principles governing domain motions in proteins. Proteins 1999, 36(4):425-435.
    143. Hayward S, Lee RA: Improvements in the analysis of domain motions in proteins from conformational change: DynDom version 1.50. J Mol Graph Model 2002, 21(3):181-183.
    144. Lee RA, Razaz M, Hayward S: The DynDom database of protein domain motions. Bioinformatics 2003, 19(10):1290-1291.
    145. Merlino A, Ceruso MA, Vitagliano L, Mazzarella L: Open interface and large quaternary structure movements in 3D domain swapped proteins: insights from molecular dynamics simulations of the C-terminal swapped dimer of ribonuclease A. Biophys J 2005, 88(3):2003-2012.
    146. Mizuno H, Fujimoto Z, Koizumi M, Kano H, Atoda H, Morita T: Structure of coagulation factors IX/X-binding protein, a heterodimer of C-type lectin domains. Nat Struct Biol 1997, 4(6):438-441.
    147. Bennett MJ, Eisenberg D: Refined structure of monomeric diphtheria toxin at 2.3 Å resolution. Protein Sci 1994, 3(9):1464-1475.
    148. Albright RA, Mossing MC, Matthews BW: High-resolution structure of an engineered Cro monomer shows changes in conformation relative to the native dimer. Biochemistry 1996, 35(3):735-742.
    149. Anderson WF, Ohlendorf DH, Takeda Y, Matthews BW: Structure of the cro repressor from bacteriophage lambda and its interaction with DNA. Nature 1981, 290(5809):754-758.
    150. Fan RE, Chen PH, Lin CJ: Working Set Selection Using Second Order Information for Training Support Vector Machines. Journal of Machine Learning Research 2005, 6:1889-1918.
    151. Tai CH, Vincent JJ, Kim C, Lee B: SE: an algorithm for deriving sequence alignment from a pair of superimposed structures. BMC Bioinformatics 2009, 10 Suppl 1:S4.
    152. Lin H, Ma X, Chandramohan P, Geist A, Samatova N: Efficient data access for parallel BLAST. In: IEEE International Parallel & Distributed Processing Symposium. Denver, CO; 2005.
    153. Rozwarski DA, Swami BM, Brewer CF, Sacchettini JC: Crystal structure of the lectin from Dioclea grandiflora complexed with core trimannoside of asparagine-linked carbohydrates. J Biol Chem 1998, 273(49):32818-32825.
    154. Velloso LM, Svensson K, Schneider G, Pettersson RF, Lindqvist Y: Crystal structure of the carbohydrate recognition domain of p58/ERGIC-53, a protein involved in glycoprotein export from the endoplasmic reticulum. J Biol Chem 2002, 277(18):15979-15984.
    155. Dijkstra EW: A note on two problems in connection with graphs. Numer Math 1959, 1:269-271.
    156. Beauchamp MA: An improved index of centrality. Behav Sci 1965, 10:161-163.
    157. Sabidussi G: The centrality of a graph. Psychometrika 1966, 31:581-603.
    158. Sacquin-Mora S, Laforet E, Lavery R: Locating the active sites of enzymes using mechanical properties. Proteins 2007, 67(2):350-359.
    159. Kabsch W, Sander C: Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 1983, 22(12):2577-2637.
    160. Alexandrov NN, Fischer D: Analysis of topological and nontopological structural similarities in the PDB: new examples with old structures. Proteins 1996, 25(3):354-365.
    161. Szustakowski JD, Weng Z: Protein structure alignment using a genetic algorithm. Proteins 2000, 38(4):428-440.
    162. Dror O, Benyamini H, Nussinov R, Wolfson HJ: Multiple structural alignment by secondary structures: algorithm and applications. Protein Sci 2003, 12(11):2492-2507.
    163. Iwakura M, Nakamura T, Yamane C, Maki K: Systematic circular permutation of an entire protein reveals essential folding elements. Nat Struct Biol 2000, 7(7):580-585.
    164. Wilson CG, Magliery TJ, Regan L: Detecting protein-protein interactions with GFP-fragment reassembly. Nat Methods 2004, 1(3):255-262.
    165. Zhu ZY, Sali A, Blundell TL: A variable gap penalty function and feature weights for protein 3-D structure comparisons. Protein Eng 1992, 5(1):43-51.
    166. Madhusudhan MS, Marti-Renom MA, Sanchez R, Sali A: Variable gap penalty for protein sequence-structure alignment. Protein Eng Des Sel 2006, 19(3):129-133.

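    Full-text release date: full text not authorized for public release (on-campus network)
    Full-text release date: full text not authorized for public release (off-campus network)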
