簡易檢索 / 詳目顯示

研究生: 張家銘
Jia-Ming Chang
論文名稱: 緊湊集合關係及其在演化樹和多重序列排比之評估的應用
Compact Set Relation and Its Application in the Evaluation of the Evolution Tree and Multiple Sequence Alignment
指導教授: 唐傳義
Chuan Yi Tang
口試委員:
學位類別: 碩士
Master
系所名稱: 電機資訊學院 - 資訊工程學系
Computer Science
論文出版年: 2002
畢業學年度: 90
語文別: 中文
中文關鍵詞: Compact SetMultiple Sequence AlignmentEvolution Tree
相關次數: 點閱:2下載:0
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  • 現有評估演化樹(Evolution Tree)或多重序列比對(Multiple Sequence Alignemnt)的理論,
    都過於強調數學上的絕對最佳解,例如:兩兩距離和(Sum-of-Pair)、樹型大小(Tree Size),等等。

    本篇論文中,我們將從相對遠近關係是否保持的角度,提出一個新的評估標準,即以緊湊集合定義的鄰近關係。

    就一個好的演化樹或多重序列比對而言,任三個物種在演化樹上或比對後的遠近關係,應與建樹前或比對前的遠近關係一致;

    然而有些物種間的遠近關係或許不是那麼重要,因此我們利用緊湊集合的特性

    (即集合內任意兩元素之間最長的距離,仍小於該集合中任一元素與集合外元素之間最短的距離),選出較為重要的遠近關係,

    並且利用這些關係是否獲得保持來評估演化樹或多重序列比對的優劣。

    本論文最後進行一系列的實驗,利用在緊湊集合所定義的鄰近關係,評估現有一些知名的建立演化樹與多重序列比對的程式。

    此外本論文也提出一個新的多重序列比對演算法--YAMA-MST。實驗數據證明,YAMA-MST比起現有MSA的程式,

    更能符合評估演化樹的需要。


    1 INTRODUCTION 1 1.1 Estimating the Evolution Tree . . . . . . . . . . . . . . . . . . . . . 1 1.2 Estimating Multiple Sequence Alignment . . . . . . . . . . . . . . . 3 2 THE COMPACT SET 5 2.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2.2 Constructing Compact Sets . . . . . . . . . . . . . . . . . . . . . . 6 2.3 The Neighboring Relation With Repsect To Compact Sets . . . . . 10 3 ESTIMATING THE EVOLTION TREE 13 3.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 3.2 The Estimation Algorithm . . . . . . . . . . . . . . . . . . . . . . . 13 3.3 Exprimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . 14 4 ESTIMATING MIULTIPLE SEQUENCE ALIGNMENT 18 4.1 The Estimation Algorithm . . . . . . . . . . . . . . . . . . . . . . . 18 4.2 YAMA-MST . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 4.3 Exprimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . 21 5 CONCLUSIONS AND FUTURE WORK 23

    [1] V. Bafna, E.L. Lawler, and P.A. Pevzner. Approximation algorithms for
    multiple sequence alignment. Theoretical Computer Science, 182:233–244,
    1997.
    [2] W. Bains. Multan: a program to align multiple dna sequences. Mucleic Acids
    Research, 14:159–177, 1986.
    [3] G.J. Barton and M.J.E. Sternberg. A stragtegy for the rapid multiple align-ment
    of protein. Journal of Molecular Biology, 19:327–337, 1987.
    [4] P. Bonizzoni and G. Della Vedova. The complexity of multiple sequence
    alignment with SP-score that is a metric. Theoretical Computer Science,
    259:63–79, 2001.
    [5] J. Carlos and J. Meidanis. Introdution to computational molecular biology.
    PWS publishing company, 1997.
    [6] H. Carrillo and D. Lipman. The multiple sequence alignment problem in
    biology. SIAM Journal on Applied Mathematics, 48(5):1073–1082, 1988.
    [7] S.C. Chan, A.K.C. Wong, and D.K.Y. Chiu. A survey of multiple sequence
    comparison methods. Bulletin of Mathematical Biology, 48:1073–1082, 1992.
    [8] T.H. Cormen, C.E. Leiserson, R.L. Rivest, and C. Stein. Introduction to
    Algorithms. MIT Press, Cambridge, MA, second edition, 2001.
    [9] F. Corpet. Multiple sequence alignment with hierarchical clustering. Nucleic
    Acids Research, 16:10881–10890, 1988.
    25
    [10] E. Dekel, J. Hu, and W. Ouyang. An optimal algorithm for finding compact
    sets. Information Processing Letters, 44:285–289, 1992.
    [11] D.J.Lipman and S.F.Altschul. A tool for multiple sequences alignment. In
    Proc.Nat.Acad. Sci. U.S.A., 1989.
    [12] D.Snakoff. Simultaneous solution of rna folding, alignment and protosequence
    prolblems. SIAM J. Appl. Math., 1985.
    [13] J.T. Fang. Constructing and application of evolutionary tree’s measurement
    model. Master’s thesis, Department of Computer Science National Tsing Hua
    University, 2000.
    [14] D.F. Feng and R.F. Doolittle. Progressive sequence alignment as a prerequisite
    to correct phylogenetic trees. Journal of Molecular Evolution, 25:351–360,
    1987.
    [15] H.N. Gabow and R.E. Tarjan. A linear-time algoritm for a special case of
    disjoint set union. J. Comput. System Sci., 30:209–221, 1985.
    [16] G.H. Gonnet, C. Korostensky, and S. Benner. Evaluation measures of multiple
    sequence alinments. Journal of Computational Biology, 7:261–276, 2000.
    [17] D. Gusfield. Efficient methods for multiple sequence alignment with guaran-teed
    error bounds. Bulletin of Mathematical Biology, 55:141–154, 1993.
    [18] D. Harel and R.E. Tarjan. Fast algorithm for finding nearest common ances-tor.
    SIAM. J. Comput., 13:285–289, 1984.
    [19] S. Henikoff and J.G. Henikoff. Amino acid substitution matrices from protein
    blocks. In Proceedings of the National Academy of Sciences USA, volume 8,
    pages 1154–1171, 1998.
    [20] D.G. Higgins and P. Sharpe. CLUSTAL: a package for performing multiple
    sequence alignment on a microcomputer. Gene, 73:237–244, 1988.
    [21] D.G. Higgins, J.D. Thompson, and T.J. Gibson. Using clustal for multiple
    sequence alignments. Methods in Enzymology, 266:383–402, 1996.
    26
    [22] T. Jiang, P. Kearney, and M. Li. A polynomial time approximation scheme
    for inferring evolutionary trees from quartet topologies and its application.
    SIAM. J. Comput., 30:1942–1961, 2001.
    [23] S.K. Kim. A note on finding compact sets in graphs represented by an adja-cency
    list. Information Processing Letters, 57:335–338, 1996.
    [24] M. Li, B. Ma, and L. Wang. Near optimal multiple alignment within a band
    in polynomial time. In STOC, pages 425–434, Portland, Oregon, 2000.
    [25] W. H. Li. Molecular Evolution. Sinauer Associates, Inc., 1997.
    [26] S. Needleman and C. Wunsch. A general method applicable to the search for
    similarities in the amino acid sequence of two proteins. Journal of Molecular
    Evolution, 48:443–453, 1970.
    [27] P.A. Pevzner. Multiple alignment, communication cost, and graph matching.
    SIAM Journal on Applied Mathematics, 52:1763–1779, 1992.
    [28] N. Saitou and M. Nei. The neighbor-joining method: a new method for
    reconstructing phylogenetic trees. Molecular Biology and Evolution, 4:406–
    425, 1987.
    [29] B. Schieber and U. Vishkin. On finding lowest common ancestor:simplification
    and parallelization. SIAM J. Comput., 17:1253–1262, 1988.
    [30] S.F.Altschul and D.J.Lipman. Trees,star and mutiple biological sequence alig-ment.
    SIAM J. Appl. Math., 1989.
    [31] D.D. Sleator and R.E. Tarjan. A data structure for dynamic trees. J. Comput.
    System Sci., 13:362–391, 1983.
    [32] P.H.A. Sneath and R.R. Sokal. Numerical Taxonomy, pages 230–234. Free-man,
    San Francisco, CA, 1973.
    [33] R.E. Tarjan. Data Structure and Network Algoritms. SIAM, Philadelphia,
    PA, 1983.
    27
    [34] W.R. Taylor. Multiple sequence alignmnet by a pairwise alignment. Computer
    Applications in Biosciences, 3:81–87, 1987.
    [35] J.D. Thompson, D.G. Higgins, and T.J. Gibson. CLUSTAL W: improving
    the sensitivity of progressive multiple sequence alignment through sequence
    weighting, position specific gap penalties, and weight matrix choice. Nucleic
    Acids Research, 22:4673–4680, 1994.
    [36] J.D. Thompson, F. Plewinak, and O. Poch. Balibase: a benchmark alignment
    database for the evolution of multiple sequence alignments. Bioinformatics,
    15:87–88, 1999.
    [37] J.D. Thompson, F. Plewinak, and O. Poch. A comprehensive comparison
    of multiple sequence alignment programs. Nucleic Acids Res., 27:2682–2690,
    1999.
    [38] T.K. Vintsyuk. Speech discrimination by dynamic programming. Comput.,
    4:52–57, 1967.
    [39] L. Wang and T. Jiang. On the complexity of multiple sequence alignment.
    Journal of Computational Biology, 1(4):337–348, 1994.
    [40] B.Y. Wu. Constructing the maximum consensus tree from rooted. In the
    proceedings of National Computer Symposium, 2001.
    [41] D. Zivkovic. A fast algorithm for finding the compact sets. Information
    Processing Letters, 38:339–342, 1991.

    無法下載圖示 全文公開日期 本全文未授權公開 (校內網路)
    全文公開日期 本全文未授權公開 (校外網路)
    全文公開日期 本全文未授權公開 (國家圖書館:臺灣博碩士論文系統)
    QR CODE