
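Graduate student: 林政宏
Cheng-Hung Lin
Thesis title: 正規表示法比對之演算法與硬體架構設計
Efficient Algorithm and Architecture Design for Regular Expression Matching
Advisor: 張世杰
Shih-Chieh Chang
Committee members:
Degree: Doctor
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2008
Academic year of graduation: 96
Language: English
Number of pages: 75
Keywords (Chinese): 網路入侵偵測系統 (network intrusion detection system), 正規表示法 (regular expression), 樣式比對 (pattern matching)
Keywords (English): NIDS, regular expression, pattern matching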
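  • The main function of a network intrusion detection system (NIDS) is to check whether the contents of network packets contain harmful or suspicious attack signatures. These signatures describe behaviors including denial of service attacks, port scans, and malware. To describe attack signatures effectively, regular expressions are widely used in intrusion detection systems such as Snort, Bro, and ClamAV. As network complexity and the number of network attacks grow daily, traditional software-based intrusion detection systems cannot meet network performance requirements, so many studies have proposed hardware architectures to accelerate pattern matching. These hardware architectures can be broadly divided into logic architectures and memory architectures.
    Logic architectures are mainly implemented on Field-Programmable Gate Arrays (FPGAs), because an FPGA allows virus signatures to be updated continually. In addition, logic architectures readily handle complex regular expression features such as ‘*’, ‘|’, and ‘+’. However, with the large increase in virus signatures, reducing the area of the logic architecture has become a very important issue. The first part of this dissertation proposes a novel hardware architecture that extracts and shares common signatures to reduce the area of the logic architecture.
    On the other hand, memory architectures are also widely used in intrusion detection systems because they offer re-configurability and scalability. However, with the large increase in virus signatures, memory architectures likewise face the problem of memory explosion. Because the performance, cost, and power consumption of a memory architecture are directly related to memory size, reducing memory usage is very important for memory architectures. The second part of this dissertation proposes a novel pattern-matching algorithm that effectively reduces the memory usage of the memory architecture.
    However, memory architectures face memory explosion for certain complex regular expression signatures. It has been proven theoretically that, for certain regular expressions, the corresponding DFA yields a state machine of exponential size. The third part of this dissertation targets such complex regular expression signatures and proposes a new memory architecture that adds a limited amount of logic circuitry to improve the ability of traditional memory architectures to handle this class of regular expressions.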


    The main purpose of a network intrusion detection system (NIDS) is to inspect packet headers and payloads against thousands of predefined malicious or suspicious patterns. These patterns describe behaviors such as denial of service attacks, port scans, or malware. To represent suspicious patterns efficiently, regular expressions are commonly adopted by systems such as Snort [22], Bro [24], and ClamAV [25], because they have better expressive power and flexibility than explicit string patterns. Given the increasing complexity of network traffic and the growing number of attacks, traditional software-based NIDSs are becoming too slow to meet networking needs. To speed up pattern matching, many researchers have proposed hardware approaches, which can be classified into two main categories: logic architectures and memory architectures.
    Logic architectures are mostly implemented on Field-Programmable Gate Arrays (FPGAs) because an FPGA can be reconfigured with new attack patterns. In addition, a logic architecture can easily handle regular expressions containing meta-characters such as ‘*’, ‘|’, and ‘+’. However, as the number of attacks increases, it becomes important to develop a methodology that minimizes the circuit area required for a large number of regular expressions. Although the minimization of logic equations has been studied intensively in the area of computer-aided design (CAD), the minimization of multiple regular expressions has been largely neglected. In the first part of this dissertation, we present a novel sharing architecture that allows our algorithm to extract and share common sub-regular expressions.
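To give a rough feel for why sharing reduces area, the following minimal sketch counts character-matching stages with and without sharing. It is not the dissertation's sharing architecture: it assumes plain literal patterns and shares only common prefixes through a trie, and the function name `shared_prefix_cost` and the sample patterns are hypothetical.

```python
def shared_prefix_cost(patterns):
    """Return (unshared, shared) counts of character-matching stages.

    Assumption: one matching stage per pattern character, and only
    common prefixes are shared (a trie). This is a simplification and
    not the sharing scheme proposed in the dissertation.
    """
    trie = {}
    shared = 0
    for pat in patterns:
        node = trie
        for ch in pat:
            if ch not in node:
                node[ch] = {}   # a new stage is needed only for an unseen prefix
                shared += 1
            node = node[ch]
    unshared = sum(len(p) for p in patterns)
    return unshared, shared


# Hypothetical patterns that share the prefixes "abc" and "ab".
print(shared_prefix_cost(["abcdef", "abcxyz", "abq"]))  # (15, 10)
```

In this toy case the three patterns need 15 stages when built separately and 10 when common prefixes are merged; sharing general sub-expressions, as the abstract describes, can remove further redundancy.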
    On the other hand, memory architectures are also widely adopted by NIDSs because they are easy to reconfigure and scale. As the number of attacks increases, however, the required memory grows tremendously. Because the performance, cost, and power consumption of a memory architecture are directly related to the memory size, reducing the memory size has become imperative. In the second part of this dissertation, we propose a memory-efficient pattern-matching algorithm that significantly reduces the memory requirement of the memory architecture.
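As background for the memory cost discussed above, here is a minimal sketch of the classic Aho-Corasick automaton [23], the baseline that memory architectures typically store as a state transition table. It does not reproduce the dissertation's memory-reduction algorithm. The helper `table_bytes`, the 256-symbol alphabet, and the 2-byte next-state pointer are illustrative assumptions.

```python
from collections import deque


def build_aho_corasick(patterns):
    """Build goto, failure, and output tables for a set of literal patterns."""
    goto, fail, output = [{}], [0], [set()]
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].add(pat)
    # Breadth-first pass: a state's failure link points to the longest
    # proper suffix of its path that is also a path from the root.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            output[nxt] |= output[fail[nxt]]
    return goto, fail, output


def search(text, automaton):
    """Return sorted (start_index, pattern) pairs for every match in text."""
    goto, fail, output = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        hits.extend((i - len(pat) + 1, pat) for pat in output[state])
    return sorted(hits)


def table_bytes(num_states, alphabet=256, pointer_bytes=2):
    """Size of a full transition table (assumed layout, for illustration only)."""
    return num_states * alphabet * pointer_bytes


automaton = build_aho_corasick(["he", "she", "his", "hers"])
print(search("ushers", automaton))    # [(1, 'she'), (2, 'he'), (2, 'hers')]
print(table_bytes(len(automaton[0])))  # 10 states -> 5120 bytes
```

Under the assumed layout, each additional state costs a full row of the table, which is why the memory grows quickly as patterns are added and why reducing the number of stored states or transitions matters.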
    However, memory architectures suffer from memory explosion caused by certain types of regular expressions. It is well known that the number of states and transitions of a DFA can be exponential in the size of its corresponding regular expression. Implementing such a regular expression pattern requires an extremely large memory to store the corresponding state transition table. In the third part of this dissertation, we propose a novel memory architecture that inserts marginal logic elements to improve the ability of the traditional memory architecture to deal with complex regular expressions.
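The exponential growth mentioned above can be observed on a standard textbook family of patterns, (a|b)*a(a|b)^(n-1), which matches strings whose n-th character from the end is ‘a’. The sketch below applies subset construction to an (n+1)-state NFA for this family and counts the reachable DFA states. The choice of this family and the function name `count_dfa_states` are illustrative assumptions, not patterns taken from the dissertation.

```python
def count_dfa_states(n):
    """Count reachable DFA states for (a|b)*a(a|b)^(n-1) via subset construction.

    NFA states are 0..n: state 0 loops on 'a' and 'b' and also moves to
    state 1 on 'a'; state i (1 <= i < n) moves to i + 1 on either symbol;
    state n is accepting and has no outgoing transitions.
    """
    def step(states, ch):
        nxt = set()
        for s in states:
            if s == 0:
                nxt.add(0)
                if ch == "a":
                    nxt.add(1)
            elif s < n:
                nxt.add(s + 1)
        return frozenset(nxt)

    start = frozenset({0})
    seen = {start}
    worklist = [start]
    while worklist:
        cur = worklist.pop()
        for ch in "ab":
            nxt = step(cur, ch)
            if nxt not in seen:
                seen.add(nxt)
                worklist.append(nxt)
    return len(seen)


for n in (1, 3, 10):
    print(n, count_dfa_states(n))  # 2, 8, and 1024 states respectively
```

The NFA grows linearly with n while the DFA needs 2^n states, because the DFA must remember which of the last n characters were ‘a’. A table-based matcher storing one row per DFA state therefore grows exponentially for such patterns.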

    Content
    Abstract
    Content
    List of Figures
    List of Tables
    Chapter 1 Introduction
    Chapter 2 Backgrounds
      2.1 Related Works
      2.2 Regular Expression Patterns
    Chapter 3 Optimization of Pattern Matching Circuits for Regular Expression on FPGA
      3.1 Introduction
      3.2 Regular Expressions for Attacks’ Description
      3.3 Minimization of Regular Expression Circuits
        3.3.1 Sharing Common Suffixes
        3.3.2 Novel Sharing Architecture
        3.3.3 Critical-Section Problem in Sharing Architecture
      3.4 Regular Expression to NFA Hardware Implementation
      3.5 Regular Expression Module Generator
      3.6 Experimental Results
      3.7 Summary
    Chapter 4 Optimization of Pattern Matching Algorithm for Memory Based Architecture
      4.1 Introduction
      4.2 Review of Aho-Corasick Algorithm
      4.3 Basic Idea
      4.4 State Traversal Mechanism on a merg_FSM
      4.5 Construction of State Traversal Machine
      4.6 Cycle Problems when Merging Multiple Sections of Pseudo-Equivalent States
      4.7 Integration with Bit-split Algorithm
      4.8 Experimental Results
      4.9 Summary
    Chapter 5 Novel Memory Architecture for Regular Expression Matching
      5.1 Introduction
      5.2 Complex Regular Expression Patterns
      5.3 Logic Architecture for Complex Regular Expression Patterns
      5.4 Novel Memory Architecture
      5.5 Experimental Results
      5.6 Summary
    Chapter 6 Conclusions
    Reference

    [1] R. Sidhu and V. K. Prasanna, “Fast regular expression matching using FPGAs,” in Proc. 9th Ann. IEEE Symp. Field-Program. Custom Comput. Mach. (FCCM), 2001, pp. 227-238.
    [2] B. L. Hutchings, R. Franklin, and D. Carver, “Assisting Network Intrusion Detection with Reconfigurable Hardware,” in Proc. 10th Ann. IEEE Symp. Field-Program. Custom Comput. Mach. (FCCM), 2002, pp. 111-120.
    [3] C. R. Clark and D. E. Schimmel, “Scalable Parallel Pattern Matching for High Speed Networks,” in Proc. 12th Ann. IEEE Symp. Field-Program. Custom Comput. Mach. (FCCM), 2004, pp. 249-257.
    [4] Y. H. Cho, S. Navab, and W. H. Mangione-Smith, “Specialized Hardware for Deep Network Packet Filtering,” in Proc. 10th Ann. ACM/SIGDA Int. Conf. Field-Program. Logic Appl. (FPL), 2002, pp. 452-461.
    [5] Y. H. Cho and W. H. Mangione-Smith, “A Pattern Matching Co-processor for Network Security,” in Proc. 42nd Des. Autom. Conf. (DAC), 2005, pp. 234-239.
    [6] M. Aldwairi, T. Conte, and P. Franzon, “Configurable String Matching Hardware for Speeding up Intrusion Detection,” in ACM SIGARCH Computer Architecture News, 2005, pp. 99-107.
    [7] S. Dharmapurikar and J. Lockwood, “Fast and Scalable Pattern Matching for Content Filtering,” in Proc. Symp. Architectures Netw. Commun. Syst. (ANCS), 2005, pp. 183-192.
    [8] L. Tan and T. Sherwood, “A high throughput string matching architecture for intrusion detection and prevention,” in 32nd Ann. Int. Symp. on Comp. Architecture (ISCA), 2005, pp. 112-122.
    [9] H. J. Jung, Z. K. Baker, and V. K. Prasanna, “Performance of FPGA Implementation of Bit-split Architecture for Intrusion Detection Systems,” in 20th Int. Parallel and Distributed Processing Symp. (IPDPS), 2006.
    [10] Y. Cho and W. H. Mangione-Smith, “Deep Packet Filter with Dedicated Logic and Read Only Memories,” in Proc. 12th Ann. IEEE Symp. Field-Program. Custom Comput. Mach. (FCCM), 2004, pp. 125-134.
    [11] Z. K. Baker and V. K. Prasanna, “A Methodology for the Synthesis of Efficient Intrusion Detection Systems on FPGAs,” in Proc. 12th Ann. IEEE Symp. Field-Program. Custom Comput. Mach. (FCCM), 2004, pp. 135-144.
    [12] Z. K. Baker and V. K. Prasanna, “High-throughput Linked-Pattern Matching for Intrusion Detection System,” in Proc. Symp. Architecture Netw. Commun. Syst. (ANCS), 2005, pp. 193-202.
    [13] J. Moscola, Y. H. Cho, and J. W. Lockwood, “Implementation of Network Application Layer Parser for Multiple TCP/IP Flows in Reconfigurable Devices,” in Proc. 16th Ann. ACM/SIGDA Int. Conf. Field-Program. Logic Appl. (FPL), 2006, pp. 1-4.
    [14] I. Sourdis and D. Pnevmatikatos, “Fast, Large-Scale String Match for a 10Gbps FPGA-based Network Intrusion Detection System,” in Proc. 11th Ann. ACM/SIGDA Int. Conf. Field-Program. Logic Appl. (FPL), 2003, pp. 880-889.
    [15] J. Moscola, J. Lockwood, R. P. Loui, and M. Pachos, “Implementation of a Content-Scanning Module for an Internet Firewall,” in Proc. 11th Ann. IEEE Symp. Field-Program. Custom Comput. Mach. (FCCM), 2003, pp. 31-38.
    [16] Z. K. Baker and V. K. Prasanna, “Time and area efficient pattern matching on FPGAs,” in Proc. ACM/SIGDA 12th Int. Symp. Field-Program. Gate Arrays, 2004, pp. 223-232.
    [17] M. Gokhale, D. Dubois, A. Dubois, M. Boorman, S. Poole, and V. Hogsett, “Granidt: Towards Gigabit Rate Network Intrusion Detection,” in Proc. 12th Ann. ACM/SIGDA Int. Conf. Field-Program. Logic Appl. (FPL), 2002, pp. 404-413.
    [18] I. Sourdis and D. Pnevmatikatos, “Pre-decoded CAMs for Efficient and High-Speed NIDS Pattern Matching,” in Proc. 12th Ann. IEEE Symp. Field-Program. Custom Comput. Mach. (FCCM), 2004, pp. 258-267.
    [19] F. Yu, R. H. Katz, and T. V. Lakshman, “Gigabit Rate Packet Pattern-Matching Using TCAM,” in Proc. 12th IEEE Int. Conf. Netw. Protocols (ICNP), 2004, pp. 174-183.
    [20] S. Dharmapurikar, P. Krishnamurthy, T. Sproull, and J. Lockwood, “Deep packet inspection using parallel bloom filters,” in Proc. 11th Symp. High Performance Interconnects, 2003, pp. 44-53.
    [21] J. W. Lockwood, J. Moscola, M. Kulig, D. Reddick, and T. Brooks, “Internet worm and virus protection in dynamically reconfigurable hardware,” in Proc. Military Aerosp. Program. Logic Device (MAPLD), 2003, p. E10.
    [22] M. Roesch, “Snort - lightweight Intrusion Detection for networks,” in Proc. 15th Syst. Administration Conf. (LISA), 1999, pp. 229-238.
    [23] A. V. Aho and M. J. Corasick, “Efficient String Matching: An Aid to Bibliographic Search,” in Communications of the ACM, 1975, pp. 333-340.
    [24] Bro official website, http://www.bro-ids.org/.
    [25] ClamAV official website, http://www.clamav.net/.
    [26] F. Yu, Z. Chen, Y. Diao, T. V. Lakshman, and R. H. Katz, “Fast and Memory-Efficient Regular Expression Matching for Deep packet Inspection,” in Proc. ACM/IEEE Symp. Architectures Netw. Commun. Syst. (ANCS), 2006, pp. 93-102.
    [27] Z. K. Baker, H. J. Jung, and V. K. Prasanna, “Regular Expression Software Deceleration for Intrusion Detection systems,” in Proc. International Conference on Field Programmable Logic and Applications (FPL), 2006, pp. 1-8.
    [28] I. Sourdis, J. Bispo, J. M. P. Cardoso, and S. Vassiliadis, “Regular expression matching for reconfigurable packet inspection,” in Proc. IEEE International conference on Field Programmable Technology (FPT), 2006, pp. 119-126.
    [29] B. Brodie, R. Cytron, and D. Taylor, “A Scalable Architecture for High-throughput Regular Expression Matching,” in Proc. 33rd Int’l Symposium on Computer Architecture (ISCA), 2006, pp. 191-202.
    [30] S. Kumar, S. Dharmapurikar, F. Yu, P. Crowley, and J. Turner, “Algorithms to Accelerate Multiple Regular Expressions Matching for Deep Packet Inspection,” in ACM SIGCOMM Computer Communication Review, ACM Press, vol. 36, issue 4, Oct. 2006, pp. 339-350.
    [31] J. Moscola, Y. H. Cho, and J. W. Lockwood, “A Scalable hybrid regular expression pattern matcher,” in Proc. 14th IEEE Symposium on Field Programmable Custom Computing Machines (FCCM), 2006, pp. 337-338.

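    Full-text release date: the full text is not authorized for public release (campus network)
    Full-text release date: the full text is not authorized for public release (off-campus network)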
