
Graduate student: 張芫瑜 (Chang, Yuan-Yu)
Thesis title: 整合蛋白質彈性網路模型與人工智慧方法預測蛋白質動態
Integration of Elastic Network Model and AI/Machine Learning for Protein Dynamics Prediction
Advisor: 楊立威 (Yang, Lee-Wei)
Oral defense committee: 林澤 (Lin, Che)
洪瑞鴻 (Hung, Jui-Hung)
楊進木 (Yang, Jinn-Moon)
蔡惠旭 (Tsai, Hui-Hsu)
Degree: Doctor
Department: College of Life Sciences and Medicine - Institute of Bioinformatics and Structural Biology
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar)
Language: English
Number of pages: 171
Keywords (Chinese): 彈力網路模型、高斯網路模型、非勻向網路模型、環境非勻向網路模型、動態組、固有動態域、固有動態、蛋白動態、深度神經網路模型、構型集合、對接、分子動力模擬、溫度因子、絕對振盪大小、頻譜熵、剽竊、BWT轉換、FM索引、隱私保護
Keywords (English): ENM, GNM, ANM, envANM, DynOmics, Intrinsic dynamics domains, Intrinsic dynamics, Protein dynamics, Deep neural network, Native ensemble, Docking, MD, B-factor, Absolute fluctuation size, Spectral entropy, Plagiarism, BWT (Burrows-Wheeler transform), FM index, Privacy protection
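  • Protein dynamics plays a critical role in regulating protein-protein interactions, protein-small-molecule interactions and catalytic activity, and thereby affects the functions of protein macromolecules in living organisms. The sequence and three-dimensional structure of a protein directly determine its intrinsic dynamics. The elastic network model (ENM) is a simple yet powerful theory for resolving the dynamics of a protein or protein complex from its three-dimensional structure. The Gaussian network model (GNM) and the anisotropic network model (ANM) are two widely used coarse-grained ENMs. To handle today's rapidly growing structural proteomics data, we built “DynOmics” (http://dynomics.pitt.edu/), a protein dynamics portal with a user-friendly interface. DynOmics gathers four applications that rapidly resolve protein dynamics from structure, two of which, ENM1.0 and iGNM2.0, were designed and implemented by us. ENM1.0 (http://dyn.life.nthu.edu.tw/oENM/) analyzes an input protein structure with ANM, GNM and the newly developed envANM to obtain the dynamics of that protein. envANM takes the surrounding environment of the molecule into account when analyzing dynamics, which improves the accuracy of the extracted dynamics; this environment can be neighboring proteins in the crystal lattice (crystal contacts), a lipid bilayer, a substrate or ligand bound to the protein, or the other subunits of a multimeric structure. By analyzing the dynamics extracted with these models, the service further provides functional analyses such as prediction of potentially functional sites/catalytic sites, reconstruction of all-atom models of ANM-predicted conformers, and filtering out of low-probability protein-protein or protein-DNA docking models (intrinsic dynamics domains, IDDs). iGNM 2.0 (https://dyn.life.nthu.edu.tw/gnmdb/) is the newest protein dynamics database in DynOmics. The dynamics information in this database was precomputed by us with GNM for most (95%) of the protein structures deposited in the Protein Data Bank. In iGNM 2.0, users can obtain advanced analysis results such as the cross-correlations between residues or between domains, the collectivity of each mode of motion, and the hinge sites of a protein. The advanced search of iGNM2.0 lets users select proteins of interest by these precomputed dynamics properties (such as vibrational entropy and largest collectivity). iGNM 2.0 also provides dynamics data for macromolecular proteins, in particular for structures in their biologically functional assemblies, helping users understand the relationship between protein structure, dynamics and function.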
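    The elastic network models described above, however, cannot directly estimate the absolute size of protein conformational changes. To solve this problem, we set out to build a deep neural network (DNN) model that efficiently predicts the absolute size of each residue's fluctuation from a protein structure. The training inputs of the network comprise structural, dynamical and chemical features of the protein, and the training target is the residue fluctuation size converted from the B-factors averaged over all X-ray structures of that protein. When predicting the absolute fluctuation sizes of 6450 proteins, the model has an error of only 24% (mean absolute error of 0.17 Å); when predicting the fluctuation sizes estimated by molecular dynamics simulations for 12 systems, the error is only 44% (mean absolute error of 0.29 Å), compared with 143% for the best software currently known for the same purpose. By analyzing the model parameters, we found that, among the 39 features, those related to protein shape and to the spectral entropy of the normal-mode density are the main factors affecting the predictions. Using the fluctuation sizes predicted by the neural network together with the directions of residue motion given by the ENM, the bound structure can be inferred from the unbound structure. Using that structure as the input to protein-protein docking software increases the yield of docking results that resemble the true binding mode.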
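    In addition, with contemporary bioinformatics algorithms at its core, we developed a plagiarism detection service with privacy protection, “Sapiens Aperio Veritas Engine” (S.A.V.E.), which supports document-to-document comparison and comparison of documents against internet data. The core integrates the FM-index, Smith-Waterman dynamic programming and a BLAST-like indexing algorithm, all of which are commonly used today for sequence alignment. Our plagiarism comparison method remains applicable even when both the document under test and the target database have been destructively encrypted. This property lets S.A.V.E. keep plagiarism detection accurate while preventing the leakage of highly confidential documents.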


    Protein dynamics is essential to the regulation of protein-protein interactions, protein-ligand interactions and enzyme activity, which in turn affect the function and efficiency of biological machines. The intrinsic dynamics of proteins are encoded in their sequences and structures. The elastic network model (ENM) is a simple yet powerful model for decoding the dynamics of proteins and their complexes from their three-dimensional structures. The Gaussian network model (GNM) and the anisotropic network model (ANM) are two of the most popular coarse-grained ENMs.
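
    As a rough illustration of how a GNM turns a structure into dynamics, the sketch below builds the Kirchhoff (connectivity) matrix from C-alpha coordinates and reads relative mean-square fluctuations off the diagonal of its pseudo-inverse. The 7.3 Å cutoff, the function names and the pure-Python matrix inversion are assumptions made for this example and are not taken from the thesis. The output is only relative: the prefactor k_B T / gamma is left out, so no absolute fluctuation size is obtained.

```python
import math


def kirchhoff(coords, cutoff=7.3):
    """Kirchhoff matrix: -1 for residue pairs within the cutoff, contact count on the diagonal."""
    n = len(coords)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1.0
                gamma[i][i] += 1.0
                gamma[j][j] += 1.0
    return gamma


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (adequate for small examples)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix; the contact network may be disconnected")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            factor = aug[r][col]
            if r != col and factor != 0.0:
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def relative_fluctuations(coords, cutoff=7.3):
    """Diagonal of the Kirchhoff pseudo-inverse, i.e. fluctuations up to k_B*T/gamma.

    For a connected network the pseudo-inverse equals (Gamma + J/N)^-1 - J/N,
    where J is the all-ones matrix; this removes the zero (rigid-body) mode.
    """
    n = len(coords)
    gamma = kirchhoff(coords, cutoff)
    shifted = [[gamma[i][j] + 1.0 / n for j in range(n)] for i in range(n)]
    inverse = invert(shifted)
    return [inverse[i][i] - 1.0 / n for i in range(n)]


if __name__ == "__main__":
    # Three pseudo-residues on a line, 3.8 Å apart: the ends are less constrained.
    chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    print(relative_fluctuations(chain))
```

    In this toy chain the two terminal nodes have one contact each and the middle node has two, so the terminal nodes come out more mobile, which is the qualitative behaviour GNM is used for.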
    We built a user-friendly web portal named “DynOmics” (http://dynomics.pitt.edu/), which collects applications designed to evaluate the dynamics of structurally resolved systems efficiently and accurately, in response to the growing body of structural proteomics data. DynOmics gives users easy access to the conformational dynamics of biomolecular structures. Four applications are currently available on DynOmics, and we contributed two of them: ENM1.0 and iGNM 2.0.
    ENM1.0 (http://dyn.life.nthu.edu.tw/oENM/) takes a protein structure as input and analyzes its dynamics using ANM, GNM and the newly developed envANM, which also considers the molecular environment. Through further advanced analysis of the model results, we provide several types of output: potentially functional sites, the extent of allosteric communication, conformers reconstructed at the atomic level from ANM-predicted motions, and a filter for protein-protein and protein-DNA interaction poses (intrinsic dynamics domains, IDDs). The definition of the ‘molecular environment’ in envANM is broad enough to include crystal contacts of a protein in the crystalline environment, a lipid bilayer, substrates or ligands bound to a protein, and the subunits surrounding a subunit of interest in a multimeric structure or assembly.
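
    The abstract does not give the envANM equations. One commonly used way to fold an environment into an ENM, shown here only as an assumed illustration, is to partition the Hessian into system (s) and environment (e) blocks and integrate out the environment, which leaves an effective Hessian for the system alone. The symbols below are chosen for this sketch; the reduction requires the environment block H_ee to be invertible.

```latex
H =
\begin{pmatrix}
H_{ss} & H_{se} \\
H_{es} & H_{ee}
\end{pmatrix},
\qquad
H_{es} = H_{se}^{\mathsf{T}},
\qquad
\bar{H}_{ss} = H_{ss} - H_{se}\, H_{ee}^{-1}\, H_{es}
```

    The normal modes of the reduced matrix describe the motions of the system as restrained by its surroundings, whereas the modes of H_ss alone would describe the isolated system.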
    iGNM2.0 (https://dyn.life.nthu.edu.tw/gnmdb/) is a protein dynamics database that holds dynamics information for more than 95% of the protein structures available in the PDB. The dynamics information in iGNM2.0 is precomputed with GNM. Users can retrieve inter-residue and inter-domain cross-correlations, cooperative modes of motion and the locations of hinge sites from the server, as plain text and through customizable 2D/3D visualization. The collection also allows users to run advanced searches on protein dynamics features such as vibrational entropy and largest collectivity. The ability of iGNM 2.0 to provide structural dynamics data on large protein structures, especially their biological assemblies, makes it a powerful resource for establishing the relation between structure, dynamics and function.
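
    The abstract does not define collectivity. A widely used definition, assumed here, scores a mode by the exponential of the Shannon entropy of its normalized squared displacements, divided by the number of residues. It ranges from 1/N (a single residue moves) to 1 (all residues move equally). The function name is hypothetical.

```python
import math


def collectivity(mode):
    """Degree of collectivity of one mode vector (one component per residue).

    kappa = (1/N) * exp(-sum_i p_i * ln p_i), with p_i = u_i^2 / sum_j u_j^2.
    Components equal to zero contribute nothing (0 * ln 0 is taken as 0).
    """
    n = len(mode)
    norm = sum(u * u for u in mode)
    if n == 0 or norm == 0.0:
        raise ValueError("mode must contain at least one non-zero component")
    entropy = 0.0
    for u in mode:
        p = u * u / norm
        if p > 0.0:
            entropy -= p * math.log(p)
    return math.exp(entropy) / n


if __name__ == "__main__":
    print(collectivity([0.5, -0.5, 0.5, -0.5]))  # every residue participates equally
    print(collectivity([1.0, 0.0, 0.0, 0.0]))    # motion localized on one residue
```

    A search on "largest collectivity" would then amount to ranking structures by the maximum of this score over their modes, under the assumption that the database uses this definition.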
    The aforementioned ENMs, however, cannot provide the absolute size of conformational changes, although they suggest their directionality. We therefore developed a deep neural network (DNN) model that, given a protein structure, efficiently estimates the absolute size of protein dynamics at residue resolution. The training features of the model include the dynamical, structural and chemical properties of a protein structure, while the training target is the fluctuation size derived from the temperature factors averaged over X-ray crystallographic structures of the same family. The model reproduces the experimentally characterized absolute fluctuations of 6450 proteins with a 24% error (mean absolute error, MAE = 0.17 Å) and predicts MD-sampled per-residue fluctuations for 12 test systems with a 44% error (MAE = 0.29 Å). The features most important in determining the absolute size of residue fluctuations are protein shape and the spectral entropy derived from the normal-mode density of states. By introducing both the AI-predicted size of residue fluctuations and the ENM-suggested deformation directions, which can be obtained from DynOmics, we improve the enrichment of native poses among protein-protein docking decoys relative to the poses generated using only the apo-form structures.
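
    To make the training target and the reported error metric concrete, the sketch below converts isotropic B-factors to root-mean-square fluctuations with the standard relation B = (8π²/3)⟨ΔR²⟩ and computes a mean absolute error. Averaging B-factors per residue across structures before converting is an assumption about the procedure, the function names are hypothetical, and the abstract does not say how the percentage errors are defined, so only the MAE is computed.

```python
import math


def bfactor_to_rmsf(b_factor):
    """RMSF in Å from an isotropic B-factor in Å^2, using B = (8*pi^2/3) * <dR^2>."""
    if b_factor < 0:
        raise ValueError("B-factor must be non-negative")
    return math.sqrt(3.0 * b_factor / (8.0 * math.pi ** 2))


def average_bfactors(per_structure):
    """Per-residue mean B-factor over several structures of equal length (assumed procedure)."""
    if not per_structure:
        raise ValueError("need at least one structure")
    length = len(per_structure[0])
    if any(len(s) != length for s in per_structure):
        raise ValueError("all structures must list the same residues")
    count = len(per_structure)
    return [sum(s[i] for s in per_structure) / count for i in range(length)]


def mean_absolute_error(predicted, target):
    """MAE between predicted and reference fluctuation sizes (same units as the inputs)."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(predicted)


if __name__ == "__main__":
    # Made-up B-factors for three residues in two crystal structures.
    mean_b = average_bfactors([[12.0, 20.0, 35.0], [14.0, 22.0, 31.0]])
    target = [bfactor_to_rmsf(b) for b in mean_b]
    prediction = [0.60, 0.95, 1.00]  # made-up model output, for illustration only
    print(target, mean_absolute_error(prediction, target))
```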
    In addition, by integrating state-of-the-art bioinformatics algorithms, we developed a plagiarism-detection software, “Sapiens Aperio Veritas Engine” (S.A.V.E.), which combines the FM-index, Smith-Waterman dynamic programming and BLAST-like indexing algorithms to detect copied paragraphs in pairwise comparisons and/or across the internet. The algorithm works even when the text (in both the target database and the queries) is destructively encrypted and compressed. Instead of searching online with plain-text queries, sensitive text can therefore be kept protected locally, allowing S.A.V.E. to achieve the highest privacy protection and the best search efficiency.
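
    For readers unfamiliar with the FM-index, the following minimal sketch shows the underlying idea: a Burrows-Wheeler transform of the text plus backward search to count and locate exact matches of a query. It works on plain characters and uses a naive suffix sort and an uncompressed occurrence count, so it illustrates the technique only; it does not reproduce the encoding, encryption or implementation used in S.A.V.E., and all names are hypothetical.

```python
from collections import Counter

SENTINEL = "\0"  # assumed absent from the text; sorts before every other character


def build_fm_index(text):
    """Return (bwt, first_row, suffix_array) for text + sentinel."""
    if SENTINEL in text:
        raise ValueError("text must not contain the sentinel character")
    text += SENTINEL
    # Naive suffix array: fine for a demo, too slow for large corpora.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in suffix_array)
    # first_row[c] = number of characters in the text that are smaller than c.
    first_row, total = {}, 0
    counts = Counter(text)
    for ch in sorted(counts):
        first_row[ch] = total
        total += counts[ch]
    return bwt, first_row, suffix_array


def _match_range(index, pattern):
    """Backward search: half-open range of sorted suffixes that start with pattern."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    bwt, first_row, _ = index
    lo, hi = 0, len(bwt)
    for ch in reversed(pattern):
        if ch not in first_row:
            return 0, 0
        lo = first_row[ch] + bwt.count(ch, 0, lo)
        hi = first_row[ch] + bwt.count(ch, 0, hi)
        if lo >= hi:
            return 0, 0
    return lo, hi


def count_occurrences(index, pattern):
    lo, hi = _match_range(index, pattern)
    return hi - lo


def locate(index, pattern):
    """Sorted start positions of every exact match of pattern in the text."""
    lo, hi = _match_range(index, pattern)
    return sorted(index[2][lo:hi])


if __name__ == "__main__":
    idx = build_fm_index("abracadabra")
    print(count_occurrences(idx, "abra"), locate(idx, "abra"))
```

    Each query character costs one step of the search loop regardless of text length once occurrence counts are precomputed, which is why this family of indexes is popular for sequence mapping; the `bwt.count` call here is a simplification of that precomputed table.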

    English Abstract
    Chinese Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1. iGNM 2.0: the Gaussian network model database for biomolecular structural dynamics
        1.1. Introduction
        1.2. Methods
            1.2.1. Gaussian network model and mode spectral
            1.2.2. Data set
            1.2.3. Inputs: query and searching functions
            1.2.4. Details about the iGNM 2.0 visualization
        1.3. Results
            1.3.1. X-ray crystallographic B-factors (3D/2D)
            1.3.2. Mode shapes (3D/2D)
            1.3.3. Domain separations by dynamics (3D/2D)
            1.3.4. GNM connectivity model (3D/2D)
            1.3.5. Cross-correlations (3D/2D)
            1.3.6. Collectivity (2D)
            1.3.7. Results in plain text
            1.3.8. Database architecture of iGNM 2.0
        1.4. Discussion and Conclusion
    Chapter 2. DynOmics: dynamics of structural proteome and Beyond
        2.1. Introduction
        2.2. Methods
            2.2.1. Environment ANM/GNM (envANM/envGNM)
            2.2.2. Potential functional sites (PFSs) prediction: Conformational-mobility-based prediction of enzyme active sites (COMPACT) algorithm
            2.2.3. Intrinsic dynamics domains (IDDs)
            2.2.4. Construction of full-atomic structures from ANM-Driven conformers after regularization of C-C pseudo-bonds
            2.2.5. Hitting/commute time
            2.2.6. Perturbation-response scanning (PRS)
        2.3. Results
            2.3.1. Description of web server
            2.3.2. Effect of environment
            2.3.3. Identification of functional sites
        2.4. Conclusion
    Chapter 3. Deep Neural Network that Predicts Protein Dynamics in the Context of Facilitating Flexible Protein Docking
        3.1. Introduction
        3.2. Material and Methods
            3.2.1. Dataset
            3.2.2. Definition of fluctuation
            3.2.3. Molecular dynamic (MD) simulations
            3.2.4. Features
            3.2.5. Spectral entropy (SE) derived from the distribution of GNM mode eigenvalues (frequency square)
            3.2.6. Shape factor (SF)
            3.2.7. Optimization of hyperparameters in DNNs
            3.2.8. Extended Garson's algorithm
            3.2.9. Docking protocol
        3.3. Results
            3.3.1. Performance Comparison between different ML Models
            3.3.2. Performance of DNN model
            3.3.3. Applications
        3.4. Discussions
            3.4.1. Importance features
            3.4.2. The performance of DNN-B model
            3.4.3. Why use the average B-factor RMSF instead of using the B-factor from individual experiment directly?
    Chapter 4. SAVE - A Plagiarism Detection Tool with Enhanced Privacy Protection
        4.1. Introduction
        4.2. Methods
            4.2.1. Text extraction from documents
            4.2.2. Reference removing
            4.2.3. Word extraction
            4.2.4. Text cleansing
            4.2.5. Encoding readable words into pseudo-biological sequences (PBS)
            4.2.6. The central idea of in-private search
            4.2.7. Burrows-Wheeler transform and FM index (this program was built by Prof. Jui-Hung Hung/洪瑞鴻’s group)
            4.2.8. Building the PBS database
            4.2.9. Mutual plagiarism detection with privacy
            4.2.10. Plagiarism detection against the Internet
            4.2.11. Mutual plagiarism detection for plain texts
            4.2.12. Number of continuously copied words
            4.2.13. False positive rate
        4.3. Results
            4.3.1. SAVE website
            4.3.2. SAVE_APP
            4.3.3. Private + Internet mode
            4.3.4. Private + Pairwise mode
            4.3.5. Non-private + Internet mode
            4.3.6. Non-private + Pairwise mode
            4.3.7. False positive rate of a plagiarism detection with privacy
            4.3.8. Browser compatibility of the SAVE website
        4.4. Discussion
            4.4.1. Advantages of encoding text into PBSs
    References

