
Author: Huang, Yen-Lin (黃彥菱)
Thesis title: Solving Genome Rearrangement Problems Using Permutation Groups (利用排列群解基因體重組問題)
Advisors: Tang, Chuan-Yi (唐傳義); Lu, Chin Lung (盧錦隆)
Oral defense committee:
Degree: Doctor
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2010
Academic year of graduation: 98
Language: English
Number of pages: 66
Keywords: genome rearrangement, permutation groups, sorting, algebra, reversal, block-interchange, fusion, fission, translocation
    With the growing availability of complete genome sequences, genome rearrangement studies based on genome-wide analysis of gene orders play an important role in phylogenetic tree reconstruction. In contrast to traditional alignment approaches, which detect point mutations (e.g., substitutions, insertions and deletions of nucleotides or amino acids), genome rearrangement studies compare gene orders to detect large-scale mutations, such as reversals, transpositions, block-interchanges (also called generalized transpositions), fusions, fissions and translocations. Given the gene orders of two genomes with the same set of genes, the genome rearrangement problem is to compute a minimum-length sequence of rearrangement operations that transforms one genome into the other. If the given genomes are represented by permutations and one of them is in positive, sorted order, the problem can also be viewed as the problem of sorting a permutation. In this thesis, using permutation groups from algebra, we first present an O(n + δ log δ)-time algorithm for the problem of sorting by block-interchanges, where n is the number of genes and δ is the minimum number of rearrangement operations required to sort a genome. We then present an O(δn)-time algorithm for the problem of sorting by reversals and block-interchanges with a weight proportion of 1:2. In addition, when dealing with multi-chromosomal genomes, we further consider translocations (including fusions and fissions), each with weight 1, and propose O(δn)-time algorithms for the problem with linear and with circular chromosomes, respectively.
    Based on the algorithms mentioned above, we have implemented a web server that allows biologists to perform genome rearrangement analysis involving reversals, block-interchanges and translocations (including fusions and fissions), and to infer phylogenetic trees of the genomes under consideration from their pairwise genome rearrangement distances. The web server also allows biologists to perform the so-called jackknife analysis to evaluate the statistical reliability of the constructed phylogenetic trees.
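    To make the notion of sorting by block-interchanges concrete, the following is a minimal Python sketch. It is not the O(n + δ log δ) algorithm of the thesis: it is a brute-force breadth-first search that is only practical for very small permutations. The function names (`block_interchange`, `block_interchange_distance`) and the cut-point convention `i < j <= k < l` are illustrative assumptions, and the genome is assumed to be a single linear chromosome with unsigned genes.

```python
from collections import deque


def block_interchange(perm, i, j, k, l):
    """Swap the two non-overlapping blocks perm[i:j] and perm[k:l] (i < j <= k < l)."""
    return perm[:i] + perm[k:l] + perm[j:k] + perm[i:j] + perm[l:]


def neighbours(perm):
    """Yield every permutation reachable by one block-interchange."""
    n = len(perm)
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j, n):
                for l in range(k + 1, n + 1):
                    yield block_interchange(perm, i, j, k, l)


def block_interchange_distance(perm):
    """Minimum number of block-interchanges that sort perm (brute-force BFS)."""
    start = tuple(perm)
    target = tuple(sorted(start))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == target:
            return dist[cur]
        for nxt in neighbours(cur):
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return None  # not reached: every permutation can be sorted


# Swapping the single-gene blocks (4) and (2) sorts this gene order in one step.
print(block_interchange((1, 4, 3, 2, 5), 1, 2, 3, 4))  # (1, 2, 3, 4, 5)
print(block_interchange_distance((2, 1, 4, 3)))        # 2
```

    Because the search visits up to n! gene orders, it serves only as a reference for checking small cases by hand.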


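    As more and more complete genome sequences are determined, genome rearrangement studies that analyze the order of genes on genomes play an important role in the construction of evolutionary trees. Unlike traditional alignment methods, which detect point mutations (for example, substitutions, insertions and deletions of nucleotides or amino acids), genome rearrangement studies compare gene orders to detect large-scale mutations, such as reversals, transpositions, block-interchanges, fusions, fissions and translocations. Given the gene orders of the genes shared by two genomes, the goal of the genome rearrangement problem is to compute a shortest sequence of rearrangement operations that transforms the gene order of one genome into that of the other. If the gene order of one of the given genomes is represented as a sorted sequence of positive integers, the genome rearrangement problem can be viewed as a sorting problem. In this thesis, using permutation groups from algebra, we first propose an algorithm with time complexity O(n + δ log δ) for the problem of sorting by block-interchanges, where n is the number of genes and δ is the minimum number of rearrangement operations needed to sort the genome. We then propose an algorithm with time complexity O(δn) for the problem of sorting by reversals and block-interchanges, where the weight ratio of reversals to block-interchanges is 1:2. In addition, when handling genomes with multiple chromosomes, we further consider translocations (including fusions and fissions), each assigned weight 1, and propose two algorithms, both with time complexity O(δn), that sort the gene orders of linear and of circular multi-chromosomal genomes, respectively. Finally, we implement the above algorithms as a software tool that biologists can use over the Internet. The tool allows biologists to perform genome rearrangement analysis involving reversals, block-interchanges and translocations (including fusions and fissions), and to infer a phylogenetic tree of the genomes from the pairwise rearrangement distances obtained in the analysis. The tool also lets biologists perform the so-called jackknife analysis, which can be used to evaluate the statistical reliability of the constructed phylogenetic tree.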
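    The 1:2 weighting of reversals against block-interchanges can be illustrated with a small Python sketch. It is not the O(δn) algorithm of the thesis: it runs Dijkstra's shortest-path search over all signed gene orders, so it is only usable for a handful of genes. The names `reversal`, `weighted_moves` and `weighted_distance` are hypothetical, and the sketch assumes a single linear chromosome whose genes are labelled 1..n, with a negative sign marking a gene on the opposite strand.

```python
import heapq


def reversal(perm, i, j):
    """Reverse the segment perm[i:j] and flip the sign of every gene in it."""
    return perm[:i] + tuple(-g for g in reversed(perm[i:j])) + perm[j:]


def block_interchange(perm, i, j, k, l):
    """Swap the two non-overlapping blocks perm[i:j] and perm[k:l] (i < j <= k < l)."""
    return perm[:i] + perm[k:l] + perm[j:k] + perm[i:j] + perm[l:]


def weighted_moves(perm):
    """Yield (weight, result) pairs: reversals cost 1, block-interchanges cost 2."""
    n = len(perm)
    for i in range(n):
        for j in range(i + 1, n + 1):
            yield 1, reversal(perm, i, j)
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j, n):
                for l in range(k + 1, n + 1):
                    yield 2, block_interchange(perm, i, j, k, l)


def weighted_distance(perm):
    """Minimum total weight needed to reach (+1, +2, ..., +n), found by Dijkstra."""
    start = tuple(perm)
    target = tuple(range(1, len(start) + 1))
    best = {start: 0}
    heap = [(0, start)]
    while heap:
        cost, cur = heapq.heappop(heap)
        if cur == target:
            return cost
        if cost > best[cur]:
            continue  # stale heap entry
        for weight, nxt in weighted_moves(cur):
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return None


print(weighted_distance((1, -3, -2, 4)))  # 1: one reversal of the segment (-3, -2)
print(weighted_distance((1, 3, 2, 4)))    # 2: one block-interchange beats three reversals
```

    The second example shows why the weights matter: sorting (1, 3, 2, 4) by reversals alone takes three operations, so a single block-interchange of weight 2 is the cheaper scenario.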

    1 Introduction
    2 Basic Concepts of Permutation Groups
    3 Sorting by Block-interchanges
      3.1 Introduction
      3.2 Sorting a Permutation by Block-interchanges
      3.3 Brief Conclusions
    4 Sorting by Reversals, Generalized Transpositions and Translocations
      4.1 Introduction
      4.2 Permutation Groups versus Genome Rearrangements
      4.3 Algorithms for Sorting by Weighted Reversals and Block-interchanges
      4.4 Algorithms for Sorting by Weighted Reversals, Block-Interchanges, Translocations
      4.5 Brief Conclusion
    5 SoRT2: A Tool for Sorting Genomes and Reconstructing Phylogenetic Trees by Reversals, Generalized Transpositions and Translocations
      5.1 Introduction
      5.2 Methods
      5.3 Tool Implementation and Usage
      5.4 Experimental Results
        5.4.1 Performance on Simulated Datasets
        5.4.2 Eleven Metazoan mtDNAs
        5.4.3 Six Mammalian Genomes
        5.4.4 Seven Bacterial Genomes
      5.5 Brief Conclusion
    6 Conclusion and Future Works

