
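Graduate student: 陳柏均
Chen, Po-Chun
Thesis title: 方文匹配之高效索引
Efficient Index for Square Pattern Matching
Advisor: 韓永楷
Hon, Wing-Kai
Committee members: 李哲榮
Lee, Che-Rung
王弘倫
Wang, Hung-Lung
Degree: 碩士
Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Computer Science
Year of publication: 2025
Academic year of graduation: 113
Language: English
Number of pages: 46
Chinese keywords: 模式匹配, 方文匹配, 索引, 方文的週期性引理
English keywords: Pattern Matching, Square Matching, Indexing, Periodicity Lemma for Squares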
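  • A string S is called a square if S is the concatenation of two identical strings. Two strings P and Q of equal length are said to square match if, for every substring of P, that substring is a square if and only if the corresponding substring of Q is also a square. The goal of the Square Pattern Matching problem is to find, in a long text, all substrings that square match a query pattern P. The problem has applications in bioinformatics, data compression, and pattern recognition.
    This thesis proposes a novel indexing framework for the Square Pattern Matching problem. Our approach builds on new findings about string periodicity related to squares in order to achieve better time and space efficiency.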


    A string S is called a square if S is formed by concatenating two identical strings. Two strings P and Q of the same length are said to square match if, for every substring of P, that substring is a square if and only if the corresponding substring of Q is also a square. The Square Pattern Matching problem aims to locate all square matches of a query pattern P in a long text, and may be applied in bioinformatics, data compression, and pattern recognition tasks.
    This thesis introduces a novel indexing framework optimized for the Square Pattern Matching problem. Our approach is based on new findings on the periodicity of strings related to squares, and achieves space compression and computational efficiency.
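
    The definitions in the abstract can be checked directly with a brute-force sketch. This is not the index proposed in the thesis; it only illustrates what "square" and "square match" mean. The function names (is_square, square_match, find_square_matches) are hypothetical, and the sketch assumes that the empty string does not count as a square. It runs in roughly O(n * m^3) time for a text of length n and a pattern of length m, which is the kind of cost an index is meant to avoid.

```python
from typing import List


def is_square(s: str) -> bool:
    """Return True if s is a non-empty string of the form ww.

    Assumption: the empty string is not treated as a square.
    """
    n = len(s)
    if n == 0 or n % 2 != 0:
        return False
    half = n // 2
    return s[:half] == s[half:]


def square_match(p: str, q: str) -> bool:
    """Check the definition directly: p and q have the same length, and
    every substring p[i:j] is a square exactly when q[i:j] is a square."""
    if len(p) != len(q):
        return False
    n = len(p)
    for i in range(n):
        # Only even-length substrings can be squares, so step j by 2.
        for j in range(i + 2, n + 1, 2):
            if is_square(p[i:j]) != is_square(q[i:j]):
                return False
    return True


def find_square_matches(text: str, pattern: str) -> List[int]:
    """Return the start positions of all windows of text that square
    match the pattern (naive scan over every window)."""
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if square_match(text[i:i + m], pattern)
    ]


if __name__ == "__main__":
    print(square_match("abab", "cdcd"))            # True
    print(square_match("abab", "aaaa"))            # False
    print(find_square_matches("xxyzzw", "aab"))    # [0, 3]
```

    In the last call, the pattern "aab" has a single square, "aa", in its first two positions. The windows "xxy" and "zzw" have a square in the same place and nowhere else, so positions 0 and 3 are reported, while "xyz" and "yzz" are rejected.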

    Abstract (Chinese)
    Abstract
    Contents
    1 Introduction
    2 Preliminaries
        2.1 Notations and naming conventions
        2.2 Periodicity of strings
        2.3 Properties for set of prefixes
    3 YZ Lemma
    4 LSS and LPS Arrays
    5 LPS Table and Diagonal Tree
        5.1 LPS table
        5.2 Diagonal tree
    6 Period Properties in Squares
        6.1 Case I
        6.2 Case II
        6.3 Case III
        6.4 Case IV
        6.5 Case V
    7 Proofs of the Claims
        7.1 Claim 1
        7.2 Claim 2
        7.3 Claim 3
        7.4 Claim 4
        7.5 Claim 5
        7.6 Claim 6
        7.7 Claim 7
        7.8 Claim 8
    8 Proper and Improper Squares
        8.1 Improper Square Lemma: Case 1
        8.2 Improper Square Lemma: Case 2
        8.3 Improper Square Lemma: Case 3
    9 Efficient Storage and Construction for LPS Table
        9.1 Efficient storage for LPS table
        9.2 Efficient construction for LPS table
    10 Conclusions and Future Work
    Bibliography

