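Graduate student: | 邱毓翰 Yu Han-Chiou |
---|---|
Thesis title: | 在有限制情況下之多重序列排比 Constrained Multiple Sequences Alignment |
Advisor: | Prof. 唐傳義 Chuan-Yi Tang |
Oral examination committee: | |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Year of publication: | 2002 |
Academic year of graduation: | 90 |
Language: | Chinese |
Number of pages: | 33 |
Keywords (Chinese): | 多重序列排比 (multiple sequence alignment), 序列排比 (sequence alignment), 有限制情況下之序列排比 (constrained sequence alignment) |
Keywords (English): | multiple sequence alignment, constrained sequence alignment, constrained alignment |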
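In this thesis we design an algorithm called constrained multiple sequence alignment. Earlier multiple sequence alignment methods do not always meet the needs of biologists, in particular for characters that are known to belong together in the same column of the alignment. We therefore design a method that places the characters a biologist designates in the same column. First, we design a three-dimensional dynamic programming method that produces a constrained alignment of two sequences. Next, following the progressive alignment approach proposed by Feng and Doolittle [8], we compute the constrained pairwise alignment of every pair of sequences and use the results to build a distance matrix. Based on the sequence similarities recorded in the distance matrix, we then use Kruskal's algorithm to build a minimum spanning tree. Finally, we merge the pairwise alignments one by one, in the order in which Kruskal's algorithm selects the tree edges, to form the constrained multiple sequence alignment. In our experiments, the method aligned all of the constraints specified by biologists together. The time complexity of the algorithm for K sequences is O(Kn^4), where n is the maximum length of the sequences. We therefore regard it as a reasonably good method.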
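The abstract names a three-dimensional dynamic program for the constrained pairwise step but does not give its recurrence. The following is a minimal sketch of one common formulation, not necessarily the one used in the thesis. It assumes the constraint is a string `c` whose characters must each appear, in order, in a column where both sequences carry that character. The function name `constrained_pairwise_align` and the scoring values (`match`, `mismatch`, `gap`) are hypothetical choices made for illustration.

```python
NEG = float("-inf")


def constrained_pairwise_align(a, b, c, match=1, mismatch=-1, gap=-1):
    """Global alignment of a and b in which every character of c, in order,
    sits in a column where a and b both carry that character.

    Returns (score, aligned_a, aligned_b), or None if no alignment can
    satisfy the constraint. Scoring values are illustrative assumptions.
    """
    n, m, r = len(a), len(b), len(c)
    # score[i][j][k]: best score for a[:i] vs b[:j] with c[:k] satisfied.
    score = [[[NEG] * (r + 1) for _ in range(m + 1)] for _ in range(n + 1)]
    back = [[[None] * (r + 1) for _ in range(m + 1)] for _ in range(n + 1)]
    score[0][0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            for k in range(r + 1):
                if i == 0 and j == 0:
                    continue  # (0,0,0) is the base case; (0,0,k>0) is infeasible
                cands = []
                if i and j:
                    s = match if a[i - 1] == b[j - 1] else mismatch
                    # Ordinary column pairing a[i-1] with b[j-1].
                    cands.append((score[i - 1][j - 1][k] + s, (i - 1, j - 1, k)))
                    # Column that also consumes the k-th constraint character.
                    if k and a[i - 1] == b[j - 1] == c[k - 1]:
                        cands.append(
                            (score[i - 1][j - 1][k - 1] + s, (i - 1, j - 1, k - 1))
                        )
                if i:  # a[i-1] against a gap
                    cands.append((score[i - 1][j][k] + gap, (i - 1, j, k)))
                if j:  # b[j-1] against a gap
                    cands.append((score[i][j - 1][k] + gap, (i, j - 1, k)))
                best = max(cands, key=lambda t: t[0])
                if best[0] > NEG:
                    score[i][j][k], back[i][j][k] = best
    if score[n][m][r] == NEG:
        return None
    # Walk the back pointers from the final cell to rebuild the alignment.
    top, bottom = [], []
    i, j, k = n, m, r
    while (i, j, k) != (0, 0, 0):
        pi, pj, pk = back[i][j][k]
        top.append(a[pi] if pi < i else "-")
        bottom.append(b[pj] if pj < j else "-")
        i, j, k = pi, pj, pk
    return score[n][m][r], "".join(reversed(top)), "".join(reversed(bottom))


print(constrained_pairwise_align("AC", "CA", "A"))  # (-1, '-AC', 'CA-')
```

The third index tracks how much of the constraint has been satisfied, so the table has (n+1)(m+1)(|c|+1) cells. A constraint that cannot be honored, such as `"AC"` for the two sequences above, yields `None` rather than an alignment that silently drops it.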
We design a new algorithm for computing constrained multiple sequence alignment (CMSA) that guarantees the generated alignment satisfies user-specified constraints requiring particular residues to be aligned together. The first step of our strategy is to design a constrained pairwise sequence alignment. Next, based on the concept of progressive alignment, we use the constrained pairwise sequence alignment to progressively merge the sequences. The time complexity of our CMSA algorithm for aligning K sequences is O(Kn^4), where n is the maximum sequence length. We tested our algorithm on RNase sequences with known structures; in the results, all the important active-site residues are aligned together.
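As the thesis abstract describes, the order in which pairwise alignments are merged comes from a minimum spanning tree built with Kruskal's algorithm over a pairwise distance matrix. The sketch below shows only that ordering step. The function name `kruskal_merge_order` and the sample matrix are hypothetical, and how distances are derived from constrained alignment scores is not stated in the abstract, so the input is assumed to be a ready-made symmetric matrix.

```python
def kruskal_merge_order(dist):
    """Return the minimum-spanning-tree edges (i, j, d) of a symmetric
    distance matrix, in the order Kruskal's algorithm accepts them."""
    k = len(dist)
    parent = list(range(k))  # union-find forest over sequence indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # All unordered pairs, cheapest first.
    edges = sorted((dist[i][j], i, j) for i in range(k) for j in range(i + 1, k))
    order = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # edge joins two separate groups, so accept it
            parent[ri] = rj
            order.append((i, j, d))
    return order


# Made-up distances between four sequences, for illustration only.
sample = [
    [0, 2, 6, 7],
    [2, 0, 5, 8],
    [6, 5, 0, 3],
    [7, 8, 3, 0],
]
print(kruskal_merge_order(sample))  # [(0, 1, 2), (2, 3, 3), (1, 2, 5)]
```

Each accepted edge would correspond to one merge: the two closest sequences are combined first, and later edges join groups that have already been aligned.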
1. Dan Gusfield (1997) Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press.
2. Pavel A. Pevzner (1999) Computational Molecular Biology: An Algorithmic Approach. The MIT Press.
3. Joao Setubal and Joao Meidanis (1997) Introduction to Computational Molecular Biology. University of Campinas, Brazil.
4. Michael S. Waterman. Introduction to Computational Biology: Maps, Sequences and Genomes.
5. Thomas H. Cormen and Charles E. Leiserson. Introduction to Algorithms. The MIT Press.
6. R. C. T. Lee and R. C. Chang (2001) Introduction to the Design and Analysis of Algorithms, second edition.
7. S. B. Needleman and C. D. Wunsch (1970) A general method applicable to the search for similarities in amino acid sequences of two proteins. J. Mol. Biol. 48, 443-453.
8. D. F. Feng and R. F. Doolittle (1987) Progressive sequence alignment as a prerequisite to correct phylogenetic trees. Journal of Molecular Evolution 25, 351-360.
9. B. Morgenstern, A. Dress, and T. Werner (1996) Multiple DNA and protein sequence alignment based on segment-to-segment comparison. Proc. Natl. Acad. Sci. USA 93, 12098-12103.
10. B. Morgenstern, K. Frech, A. Dress, and T. Werner (1998) DIALIGN: Finding local similarities by multiple sequence alignment. Bioinformatics 14, 290-294.
11. B. Morgenstern (1999) DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment. Bioinformatics 15, 211-218.
12. A. Apostolico and R. Giancarlo (1998) Sequence alignment in molecular biology. Journal of Computational Biology 5, 173-196.
13. 林世杰. Master's thesis, Institute of Computer Science, National Tsing Hua University.