Graduate student: | 施承廷 Shih, Cheng Ting |
---|---|
Thesis title: | 利用一級與三級結構進行兩個RNA的比對 Pairwise Alignment of RNA Using Primary and Tertiary Structures |
Advisor: | 盧錦隆 Lu, Chin Lung |
Oral examination committee: | 李家同 Lee, Chia-Tung; 唐傳義 Tang, Chuan Yi; 邱顯泰 Chiu, Hsien-Tai |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Year of publication: | 2015 |
Academic year of graduation: | 103 |
Language: | Chinese |
Number of pages: | 35 |
Keywords (Chinese): | 生物資訊 、核醣核酸三級結構 、結構比對 、結構字元 、親合性互動式 |
Keywords (English): | bioinformatics, RNA tertiary structure, structural alignment, structural alphabet, affinity propagation |
The number of RNA tertiary (3D) structures deposited in the PDB database has continued to grow in recent years. Because the structures of RNA molecules evolve more slowly than their sequences, structural comparison of RNAs can yield more significant insights into their functions and evolutionary relationships, insights that cannot be detected by analyzing sequence information alone. In 2010, our laboratory developed a tool called iPARTS that aligns two RNA 3D structures and determines their structural similarity. iPARTS works in four basic steps. First, a Ramachandran-like diagram of RNAs is derived by plotting the nucleotides of RNA structures in the PDB database on a 2D plane using their two pseudo-torsion angles η and θ. Second, the affinity propagation clustering algorithm is applied to the η-θ plot to obtain a structural alphabet (SA) of 23 nucleotide conformations. Third, the SA is used to transform RNA 3D structures into 1D sequences of SA letters. Finally, classical sequence alignment methods are used to compare two SA-encoded sequences and determine their structural similarity. Many new RNA 3D structures have been deposited in the PDB database since we published iPARTS in 2010. In addition, a recent study reported that the nucleotide (1D) sequences of RNAs can provide useful additional information for their structural alignment. In this study, therefore, we first update iPARTS using the RNA 3D structures currently available in the PDB database. We then design two methods that incorporate RNA 1D sequence information into iPARTS so that it can align two RNA 3D structures more accurately based on both their sequences and their structures. Our experimental results demonstrate that the new version of iPARTS, named iPARTS2, indeed outperforms the previous version in the accuracy of both RNA structure alignment and functional assignment. Moreover, both of the methods we propose for incorporating 1D sequence information into iPARTS2 further outperform iPARTS2 using only the 3D structures of RNAs in the accuracy of functional assignment.
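The last two steps described above (encoding a structure as a sequence of SA letters, then aligning two encoded sequences) can be illustrated with a minimal Python sketch. It is not the iPARTS implementation. The four cluster centers in `TOY_CENTERS`, the function names, and the match, mismatch, and gap scores are all hypothetical choices made for illustration. The real alphabet has 23 letters obtained by affinity propagation, and its scoring scheme is not given in this abstract. The sketch also assumes that the η and θ angles (in degrees) have already been computed for each nucleotide.

```python
import math

# Hypothetical cluster centers (eta, theta) in degrees; NOT the real 23-letter SA.
TOY_CENTERS = {
    "A": (170.0, 210.0),
    "B": (60.0, 300.0),
    "C": (300.0, 60.0),
    "D": (5.0, 355.0),
}


def angular_diff(a, b):
    """Smallest absolute difference between two angles, in degrees (0..180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def encode(angles, centers):
    """Map each (eta, theta) pair to the letter of its nearest cluster center.

    Distances respect the 360-degree wrap-around of both angles.
    """
    letters = []
    for eta, theta in angles:
        nearest = min(
            centers,
            key=lambda k: math.hypot(
                angular_diff(eta, centers[k][0]),
                angular_diff(theta, centers[k][1]),
            ),
        )
        letters.append(nearest)
    return "".join(letters)


def global_align_score(s, t, match=2, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The scores are illustrative assumptions, not the iPARTS scoring matrix.
    """
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        cur = [i * gap]
        for j in range(1, len(t) + 1):
            sub = match if s[i - 1] == t[j - 1] else mismatch
            cur.append(max(prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


# Toy usage: two short made-up angle lists standing in for two RNA structures.
rna1 = encode([(165.0, 215.0), (65.0, 295.0), (305.0, 50.0), (172.0, 208.0)], TOY_CENTERS)
rna2 = encode([(168.0, 212.0), (58.0, 302.0), (171.0, 209.0)], TOY_CENTERS)
print(rna1, rna2, global_align_score(rna1, rna2))
```

Once a structure is reduced to a string of letters, any classical sequence alignment method applies unchanged. That reduction is what lets the structure comparison reuse sequence alignment.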