
Graduate student: 羅光倫 (Lo, Allan)
Thesis title: Predicting the structural characteristics of membrane proteins by computational approaches
Thesis title (Chinese): 以機器學習計算方法預測膜蛋白結構特性
Advisors: 許聞廉 (Hsu, Wen-Lian); 呂平江 (Lyu, Ping-Chiang)
Oral defense committee: (not listed)
Degree: Doctor
Department: College of Life Sciences and Medicine - Institute of Bioinformatics and Structural Biology
Year of publication: 2009
Academic year of graduation: 97 (ROC calendar)
Language: English
Number of pages: 80
Keywords: membrane protein (膜蛋白), structure prediction (結構預測), machine learning (機器學習), transmembrane helix (穿膜螺旋), bioinformatics (生物資訊)
    This thesis comprises several studies on predicting the structural characteristics of membrane proteins from sequence using machine learning methods. Taken together, these predicted structural features are important modules for ab initio modeling of membrane protein structures. First, a membrane topology prediction method, SVMtop, was developed using support vector machines (SVMs). A novel topology scoring function was proposed, and SVMtop improves on state-of-the-art approaches by achieving over 70% accuracy in correctly predicting both the locations of transmembrane (TM) helices and their sidedness on standard benchmarks. Building on this work, TMhit was developed to predict helix-helix interactions from residue contacts. We calculated statistical propensities for contact pairs between interacting TM helices and found that small and polar residues play an important role in interhelical contacts. In TMhit, contact propensities were combined with other sequence and structural features to train the SVMs in a novel two-level framework. Compared with the conventional method, the proposed two-level framework not only significantly reduces computational cost but also lowers the number of false positives. Lastly, the development of a new method to predict the solvent accessibility of residues in TM domains is described (manuscript in preparation). The method employs a random forests algorithm for feature selection and regression. At present, it achieves a mean absolute error of 27.25 Å² and a Pearson's correlation of 0.50 based on 5-fold cross validation.
    In summary, the works presented in this thesis comprise several computational approaches to facilitate structure and function prediction for membrane proteins. While the number of known membrane protein structures continues to grow at a slow pace, bioinformatics methods will play an important role in advancing our understanding of membrane protein structure assembly and function.
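    The two figures quoted for the solvent accessibility predictor (mean absolute error and Pearson's correlation under 5-fold cross validation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the helper names are hypothetical, and a one-feature least-squares line stands in for the random forests regressor, which the record does not describe in detail.

```python
import math
import random


def mean_absolute_error(y_true, y_pred):
    """Average of |observed - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Stand-in model (NOT a random forest): one-feature least squares."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def cross_validate(xs, ys, k=5):
    """Pool held-out predictions over k folds, then score them once."""
    preds = [0.0] * len(xs)
    for test in k_fold_indices(len(xs), k):
        held_out = set(test)
        train = [i for i in range(len(xs)) if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in test:
            preds[i] = slope * xs[i] + intercept
    return mean_absolute_error(ys, preds), pearson_r(ys, preds)


# Synthetic data: a made-up feature and a noisy made-up accessibility value.
rng = random.Random(1)
feature = [rng.uniform(0.0, 1.0) for _ in range(100)]
accessibility = [60.0 * f + rng.gauss(0.0, 15.0) for f in feature]
mae, r = cross_validate(feature, accessibility, k=5)
print(f"MAE = {mae:.2f}, Pearson r = {r:.2f}")
```

    Each residue is predicted exactly once, by a model that never saw it during fitting, so both scores reflect held-out performance. Pooling the predictions before scoring is one common convention; averaging per-fold scores is another, and the record does not say which was used.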


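    This thesis addresses the prediction of structural characteristics of membrane proteins using machine learning methods. These characteristics include membrane protein topology, interactions between transmembrane helices, spatial contacts between amino acid residues, and the lipid exposure surface of transmembrane helices. These features are an important part of developing structure prediction methods, especially for membrane proteins, for which very few structures are known. First, for topology prediction, the thesis describes a new method called SVMtop, which uses hierarchical classification with support vector machines and a new scoring function to predict topology. SVMtop outperforms many published methods in accuracy, particularly in the case where both the locations and the orientations of transmembrane helices are predicted correctly, for which accuracy is about 70%. Second, for predicting transmembrane helix interactions, we first used statistical methods to calculate propensity scores for amino acids forming contacts, and found that small and polar amino acids have a higher propensity to form spatial contacts. We developed a method named TMhit, which uses a two-level architecture and incorporates other sequence and structural features to train support vector machines. Compared with the conventional method, the new two-level system reduces redundant computation and false predictions. Finally, for predicting the lipid exposure surface area of transmembrane helices, the thesis describes a new method that uses random forests for feature selection and regression. At present, under 5-fold cross validation, the best result is a mean absolute error of 27.25 Å² and a correlation of 0.50. The thesis aims, through these methods, to make substantial progress in membrane protein structure prediction. Because experimental determination of membrane protein structures still faces many bottlenecks, developing bioinformatics methods to predict the structure and function of membrane proteins will be a very important direction for understanding their roles in living organisms.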
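    The idea of a statistical contact propensity mentioned in the abstract can be sketched as an observed-over-expected ratio. This is an assumed, commonly used definition and not necessarily the formula used for TMhit; the function names and the toy data are hypothetical and invented for illustration only.

```python
from collections import Counter


def pair_key(a, b):
    """Treat contacts as unordered: (G, A) and (A, G) are the same pair."""
    return tuple(sorted((a, b)))


def contact_propensities(contacts, residues):
    """Observed/expected ratio per residue pair (assumed definition).

    contacts: iterable of (residue, residue) pairs seen in contact.
    residues: iterable of all residues in the helices (background).
    Every residue in `contacts` must also occur in `residues`.
    """
    contacts = list(contacts)
    residues = list(residues)
    observed = Counter(pair_key(a, b) for a, b in contacts)
    background = Counter(residues)
    n_contacts = len(contacts)
    n_residues = len(residues)

    propensities = {}
    for (a, b), count in observed.items():
        fa = background[a] / n_residues
        fb = background[b] / n_residues
        # An unordered pair of two different residues can arise two ways.
        expected = fa * fb * (1 if a == b else 2)
        propensities[(a, b)] = (count / n_contacts) / expected
    return propensities


# Toy data, invented for illustration only.
toy_residues = "GGAALLLLLL"  # 2 G, 2 A, 6 L
toy_contacts = [("G", "G"), ("G", "A"), ("A", "G"), ("L", "L")]
scores = contact_propensities(toy_contacts, toy_residues)
for pair, score in sorted(scores.items()):
    print(pair, round(score, 2))
```

    Under this definition, a value above 1 means the pair is found in contact more often than residue composition alone would suggest, and a value below 1 means less often. In the toy data the small residues G and A are over-represented in contacts while L-L is under-represented, which mirrors only qualitatively the finding reported in the abstract.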

    Contents
    Abstract
    Chinese abstract (中文摘要)
    Acknowledgement
    List of figures
    List of tables
    List of publications
    Table of contents
    1. Introduction
    2. Membrane proteins
        2.1 Biological membranes
        2.2 Types of membrane proteins
        2.3 Targeting and translocation of membrane proteins
        2.4 Membrane protein topology and structure assembly
            2.4.1 Topogenesis of membrane proteins
            2.4.2 Membrane protein structure assembly
        2.5 Experimental determination of membrane protein structures
        2.6 Predictions of membrane protein topology and structure
            2.6.1 Topology prediction
            2.6.2 Structure prediction
        2.7 Modelling transmembrane helical bundles in silico
    3. Introduction to machine learning methods
        3.1 Pattern classification
        3.2 Feature selection, training, and testing
            3.2.1 Probabilistic latent semantic analysis
            3.2.2 Embedded feature selection in random forests
            3.2.3 Cross validation and statistical tests
        3.3 Neural networks
        3.4 Support vector machines
        3.5 Random forests
    4. Membrane protein topology prediction using support vector machines
        4.1 Motivation
        4.2 Methods
            4.2.1 Data sets
            4.2.2 Hierarchical SVM classifiers for topology prediction
            4.2.3 Input features
            4.2.4 Alternating geometric scoring function
            4.2.5 Evaluation measures
        4.3 Results
            4.3.1 Benchmark comparisons with existing methods
            4.3.2 Discrimination between soluble and membrane proteins
            4.3.3 Analysis of topology scoring function
        4.4 Summary
    5. Predicting residue contacts and helix-helix interactions in membrane proteins
        5.1 Motivation
        5.2 Methods
            5.2.1 Data sets
            5.2.2 Definition of interacting helices
            5.2.3 A novel two-level contact prediction framework
            5.2.4 Estimation of contact propensities
            5.2.5 Input features
            5.2.6 Evaluation measures
        5.3 Results
            5.3.1 Helical contact propensities
            5.3.2 Leave-one-out cross validation accuracy
            5.3.3 Independent test accuracy
            5.3.4 Helix-helix interaction prediction
            5.3.5 Analysis of two-level contact prediction framework
        5.4 Summary
    6. Predicting the lipid exposure of transmembrane helices
        6.1 Motivation
        6.2 Methods
            6.2.1 Data sets
            6.2.2 Random forests regression
            6.2.3 Input features
            6.2.4 Evaluation measures
        6.3 Preliminary results
            6.3.1 Solvent exposure propensity scales
            6.3.2 Feature selection on AAindex database
            6.3.3 Comparisons of prediction accuracy
        6.4 Summary
    7. Concluding remarks and outlook
    References
    Appendix
    List of web servers


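    Full-text release date (campus network): full text not authorized for public release
    Full-text release date (off-campus network): full text not authorized for public release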
