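Graduate student: | Li, Chih-Yu (李智宇) |
---|---|
Thesis title: | A Study on Consensus Methods for Phylogenies (演化樹的一致樹尋找方法之研究) |
Advisor: | Wang, Biing-Feng (王炳豐) |
Oral defense committee: | Wang, Jia-Shung (王家祥); Huang, Yao-Ting (黃耀廷) |
Degree: | Master |
Department: | |
Year of publication: | 2018 |
Academic year of graduation: | 106 |
Language: | English |
Number of pages: | 56 |
Keywords (Chinese): | 演算法 (algorithm), 演化樹 (phylogeny) |
Keywords (English): | Algorithm, Phylogeny |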
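Phylogenies (evolutionary trees) are an important data structure for analyzing the evolutionary relationships among organisms. However, phylogenies built from different information or by different methods may have different structures. Several methods exist for resolving such conflicts, and one of the most commonly used is the consensus tree, which is constructed from several phylogenies to summarize the information they share. Adams proposed the first consensus tree method in 1972, known as the Adams consensus tree. Since then, many kinds of consensus trees have been proposed and studied in depth, and some have been implemented in widely used bioinformatics software.
This thesis first analyzes the various consensus trees, introducing their definitions and summarizing the running times of their known algorithms. In addition, it proposes an algorithm with time complexity O(k^2n) for the frequency difference consensus tree problem, where n is the number of species and k is the number of input phylogenies. The previous best algorithm for this problem, due to Jansson et al., runs in min{O(kn^2), O(kn(k + log^2 n))} time. Jansson et al. also suggested improving their O(kn(k + log^2 n))-time algorithm to O(k^2n) time; the algorithm proposed in this thesis achieves that goal.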
Phylogenies are important structures for analyzing the evolutionary relationships among species. However, phylogenies obtained from different data sets or methods may lead to different structures. Several methods have been proposed to resolve such conflicts. One of the most popular approaches is the consensus tree method, which constructs a consensus tree to summarize the information common to a collection of different phylogenies on the same set of species. Adams introduced the first method, called the Adams consensus tree, in 1972. Since then, numerous consensus tree methods have been proposed and extensively studied. Some of them are also implemented in popular computational phylogenetic software packages.
In this thesis, a comprehensive review of consensus tree methods is first presented. For each method, its definition is introduced and its known complexity results are summarized. This thesis also provides an efficient O(k^2n)-time algorithm for the frequency difference consensus tree problem, which is one of the more recent consensus tree methods. The previous best upper bound for this problem was min{O(kn^2), O(kn(k + log^2 n))}, due to Jansson et al., where n is the number of species and k is the number of input trees. Jansson et al. posed improving their O(kn(k + log^2 n)) upper bound to O(k^2n) as an open problem. The algorithm presented in this thesis gives a positive answer to their question.
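To make the problem statement concrete, the following is a minimal brute-force sketch of the frequency difference consensus tree as it is commonly defined in the literature: a cluster (the leaf set below some node) is kept if it occurs in strictly more input trees than every cluster that is incompatible with it. This is not the O(k^2n)-time algorithm of the thesis; it compares every pair of distinct clusters and is far slower. The nested-tuple tree encoding and the function names `clusters`, `compatible`, and `frequency_difference_clusters` are assumptions made for illustration only.

```python
from collections import Counter
from itertools import chain


def clusters(tree):
    """Return the clusters (frozensets of leaf labels) of a rooted tree.

    Assumed encoding: an internal node is a tuple of children,
    and a leaf is any non-tuple label such as a string.
    """
    found = set()

    def visit(node):
        if isinstance(node, tuple):
            leaves = frozenset(chain.from_iterable(visit(c) for c in node))
        else:
            leaves = frozenset([node])
        found.add(leaves)
        return leaves

    visit(tree)
    return found


def compatible(a, b):
    """Two clusters are compatible if they are disjoint or nested."""
    return a.isdisjoint(b) or a <= b or b <= a


def frequency_difference_clusters(trees):
    """Brute-force selection of the frequency difference clusters.

    Each cluster is counted once per tree that contains it. A cluster is
    kept when its count is strictly greater than the count of every
    cluster incompatible with it.
    """
    counts = Counter(chain.from_iterable(clusters(t) for t in trees))
    return {
        c
        for c, f in counts.items()
        if all(f > g for d, g in counts.items() if not compatible(c, d))
    }


if __name__ == "__main__":
    trees = [
        (("a", "b"), ("c", "d")),
        ((("a", "b"), "c"), "d"),
        (("a", "c"), ("b", "d")),
    ]
    for cluster in sorted(frequency_difference_clusters(trees), key=len):
        print(sorted(cluster))
```

In this small example the cluster {a, b} occurs in two trees, while each cluster that conflicts with it occurs in only one, so it is kept. The cluster {c, d} is dropped because it ties with the conflicting cluster {a, b, c}. Singleton clusters and the full leaf set are compatible with everything and are always kept.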
1. Abbott, R. J. et al., 2000. Migration and refugia in the Arctic. Science, Volume 289, pp. 1343-1346.
2. Adams III, E. N., 1972. Consensus techniques and the comparison of taxonomic trees. Systematic Zoology, 21(4), pp. 390-397.
3. Adams III, E. N., 1986. N-trees as nestings: complexity, similarity, and consensus. Journal of Classification, 3(2), pp. 299-317.
4. Aho, A. V., Sagiv, Y., Szymanski, T. G. & Ullman, J. D., 1981. Inferring a tree from lowest common ancestors with an application to the optimization of relational expressions. SIAM Journal on Computing, 10(3), pp. 405-421.
5. Amenta, N., Clarke, F. & St. John, K., 2003. A linear-time majority tree algorithm. s.l., s.n., pp. 216-227.
6. Amir, A. & Keselman, D., 1997. Maximum agreement subtree in a set of evolutionary trees: metrics and efficient algorithms. SIAM Journal on Computing, 26(6), pp. 1656-1669.
7. Bandelt, H.-J. & Dress, A., 1989. Weak hierarchies associated with similarity measures--an additive clustering technique. Bulletin of Mathematical Biology, 51(1), pp. 133-166.
8. Barthélemy, J. P. & McMorris, F. R., 1986. The median procedure for n-trees. Journal of Classification, Volume 3, pp. 329-334.
9. Baum, B. R., 1992. Combining trees as a way of combining data sets for phylogenetic inference, and the desirability of combining gene trees. Taxon, 41(1), pp. 3-10.
10. Berry, V. & Bryant, D., 1999. Faster reliable phylogenetic analysis. s.l., s.n., pp. 59-68.
11. Berry, V. & Gascuel, O., 2000. Inferring evolutionary trees with strong combinatorial evidence. Theoretical Computer Science, 240(2), pp. 271-298.
12. Boivin, G. et al., 2002. Virological features and clinical manifestations associated with human metapneumovirus: a new paramyxovirus responsible for acute respiratory-tract infections in all age groups. The Journal of Infectious Diseases, 186(9), pp. 1330-1334.
13. Bremer, K., 1990. Combinable component consensus. Cladistics, 6(4), pp. 369-372.
14. Bryant, D., 2003. A classification of consensus methods for phylogenetics. In: M. F. Janowitz, et al. eds. Bioconsensus. Providence (RI): American Mathematical Society, pp. 163-184.
15. Bryant, D. & Berry, V., 2001. A structured family of clustering and tree construction methods. Advances in Applied Mathematics, Volume 27, pp. 705-732.
16. Bryant, D. & Moulton, V., 1999. A polynomial time algorithm for constructing the refined Buneman tree. Applied Mathematics Letters, Volume 12, pp. 51-56.
17. Bryant, D. & Waddell, P., 1998. Rapid evaluation of least-squares and minimum-evolution criteria on phylogenetic trees. Molecular Biology and Evolution, Volume 15, pp. 1346-1359.
18. Buneman, O. P., 1971. The recovery of trees from measures of dissimilarity. In: F. R. Hodson, D. G. Kendall & P. Tautu, eds. Mathematics in the Archaeological and Historical Sciences. Edinburgh: Edinburgh University Press, pp. 387-395.
19. Constantinescu, M. & Sankoff, D., 1995. An efficient algorithm for supertrees. Journal of Classification, Volume 12, pp. 101-112.
20. Cotton, J. A. & Wilkinson, M., 2007. Majority-rule supertrees. Systematic Biology, 56(3), pp. 445-452.
21. Day, W. H. E., 1985. Optimal algorithms for comparing trees with labeled leaves. Journal of Classification, 2(1), pp. 7-28.
22. Day, W. H. E. & Sankoff, D., 1986. Computational complexity of inferring phylogenies by compatibility. Systematic Zoology, 35(2), pp. 224-229.
23. Dong, J., Fernández-Baca, D., McMorris, F. R. & Powers, R. C., 2010. Majority-rule (+) consensus trees. Mathematical Biosciences, 228(1), pp. 10-15.
24. Felsenstein, J., 2004. Inferring Phylogenies. Sunderland (MA): Sinauer Associates, Inc.
25. Felsenstein, J., 2005. PHYLIP, version 3.6 Software package, Department of Genome Sciences, University of Washington. Seattle: s.n.
26. Foulds, L. R. & Graham, R. L., 1982. The Steiner problem in phylogeny is NP-complete. Advances in Applied Mathematics, Volume 3, pp. 43-49.
27. Goloboff, P. A. et al., 2003. Improvements to resampling measures of group support. Cladistics, 19(4), pp. 324-332.
28. Goloboff, P. A., Farris, J. S. & Nixon, K. C., 2008. TNT, a free program for phylogenetic analysis. Cladistics, 24(5), pp. 774-786.
29. Gordon, A. D., 1980. On the Assessment and Comparison of Classifications. In: R. Tomassone, ed. Analyse de Données et Informatique. Le Chesnay: INRIA.
30. Hendy, M. D., Little, C. H. C. & Penny, D., 1984. Comparing trees with pendant vertices labelled. SIAM Journal on Applied Mathematics, 44(5), pp. 1054-1065.
31. Henzinger, M. R., King, V. & Warnow, T., 1999. Constructing a tree from homeomorphic subtrees, with application to computational evolutionary biology. Algorithmica, Volume 24, pp. 1-13.
32. Holder, M. T., Sukumaran, J. & Lewis, P. L., 2008. A justification for reporting the majority-rule consensus tree in Bayesian phylogenetics. Systematic Biology, 57(5), pp. 814-821.
33. Jansson, J., Li, Z.-X. & Sung, W.-K., 2017. On finding the Adams consensus tree. Information and Computation, Volume 256, pp. 334-347.
34. Jansson, J., Ramesh, R., Shen, C. & Sung, W.-K., 2018. Algorithms for the majority rule (+) consensus tree and the frequency difference consensus tree. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 15(1), pp. 15-26.
35. Jansson, J., Shen, C. & Sung, W.-K., 2016. Improved algorithms for constructing consensus trees. Journal of the ACM, Volume 63.
36. Jansson, J. & Sung, W.-K., 2013. Constructing the R* consensus tree of two trees in subcubic time. Algorithmica, Volume 66, pp. 329-345.
37. Jansson, J., Sung, W.-K., Vu, H. & Yiu, S.-M., 2016. Faster algorithms for computing the R* consensus tree. Algorithmica, Volume 76, pp. 1224-1244.
38. Kannan, S., Warnow, T. & Yooseph, S., 1998. Computing the local consensus of trees. SIAM Journal on Computing, 27(6), pp. 1695-1724.
39. Kubicka, E., Kubicki, G. & McMorris, F. R., 1995. An algorithm to find agreement subtrees. Journal of Classification, Volume 12, pp. 91-99.
40. Lapointe, J. F. & Cucumel, G., 1997. The average consensus procedure: combination of weighted trees containing identical or overlapping sets of taxa. Systematic Biology, 46(2), pp. 306-312.
41. Leung, M.-Y., Paszkowski, C. A. & Russell, A. P., 2014. Genetic structure of the endangered greater short-horned lizard (Phrynosoma hernandesi) in Canada: evidence from mitochondrial and nuclear genes. Canadian Journal of Zoology, 92(10), pp. 875-883.
42. Luo, Z.-X., Ji, Q., Wible, J. R. & Yuan, C.-X., 2003. An early Cretaceous tribosphenic mammal and metatherian evolution. Science, Volume 302, pp. 1934-1940.
43. Margush, T. & McMorris, F. R., 1981. Consensus n-trees. Bulletin of Mathematical Biology, Volume 43, pp. 239-244.
44. McMorris, F. R., Meronk, D. B. & Neumann, D. A., 1983. A view of some consensus methods for trees. s.l., Springer-Verlag, pp. 122-126.
45. McMorris, F. R. & Wilkinson, M., 2011. Conservative supertree. Systematic Biology, 60(2), pp. 232-238.
46. Moncalvo, J.-M. et al., 2002. One hundred and seventeen clades of euagarics. Molecular Phylogenetics and Evolution, Volume 23, pp. 357-400.
47. Nelson, G., 1979. Cladistic Analysis and Synthesis: Principles and Definitions, with a Historical Note on Adanson's Familles Des Plantes (1763–1764). Systematic Biology, 28(1), pp. 1-21.
48. Neumann, D. A., 1983. Faithful consensus methods for n-trees. Mathematical Biosciences, Volume 63, pp. 271-287.
49. Ng, M. P. & Wormald, N. C., 1996. Reconstruction of rooted trees from subtrees. Discrete Applied Mathematics, Volume 69, pp. 19-31.
50. Page, R. D. M., 1990. Tracks and trees in the antipodes: A reply to Humphries and Seberg. Systematic Zoology, 39(3), pp. 288-299.
51. Page, R. D. M., n.d. COMPONENT, Tree comparison software for Microsoft Windows. 2.0 ed. London: The Natural History Museum.
52. Phillips, C. & Warnow, T. J., 1996. The asymmetric median tree - a new model for building consensus. Discrete Applied Mathematics, Volume 71, pp. 311-335.
53. Ragan, M. A., 1992. Phylogenetic inference based on matrix representation of trees. Molecular Phylogenetics and Evolution, 1(1), pp. 53-58.
54. Raxworthy, C. J., Forstner, M. R. J. & Nussbaum, R. A., 2002. Chameleon radiation by oceanic dispersal. Nature, Volume 415, pp. 784-787.
55. Rohlf, F. J., 1982. Consensus indices for comparing classifications. Mathematical Biosciences, Volume 59, pp. 131-144.
56. Ronquist, F., 1998. Fast Fitch-parsimony algorithms for large data sets. Cladistics, 14(4), pp. 387-400.
57. Steel, M., 1992. The complexity of reconstructing trees from qualitative characters and subtrees. Journal of Classification, Volume 9, pp. 91-116.
58. Steel, M., 2016. Phylogeny: Discrete and Random Processes in Evolution. s.l.: SIAM.
59. Steel, M. & Velasco, J. D., 2014. Axiomatic opportunities and obstacles for inferring a species tree from gene trees. Systematic Biology, 63(5), pp. 772-778.
60. Steel, M. & Warnow, T., 1993. Kaikoura tree theorems: computing the maximum agreement subtree. Information Processing Letters, Volume 48, pp. 77-82.
61. Stinebrickner, R., 1984. s-consensus trees and indices. Bulletin of Mathematical Biology, 46(5-6), pp. 923-935.
62. Stinebrickner, R., 1986. s-consensus index method: an additional axiom. Journal of Classification, 3(2), pp. 319-327.
63. Swofford, D. L., 1991. When are phylogeny estimates from molecular and morphological data incongruent? In: M. M. Miyamoto & J. Cracraft, eds. Phylogenetic Analysis of DNA Sequences. New York: Oxford University Press, pp. 295-333.
64. Swofford, D. L., 2003. PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Sunderland (MA): Sinauer Associates.
65. Swofford, D. L., Olsen, G. J., Waddell, P. J. & Hillis, D. M., 1996. Phylogenetic inference. In: D. M. Hillis, C. Moritz & B. K. Mable, eds. Molecular Systematics. 2nd ed. s.l.: Sinauer, pp. 407-514.
66. Wareham, H., 1985. An efficient algorithm for computing Ml consensus trees, B.Sc. Honours thesis. s.l.: Memorial University of Newfoundland.