Graduate student: | 卓思辰 Cho, Szu Chen |
---|---|
Thesis title: | 多核心系統中可自動調整的高效率竊聽過濾器 An Efficient Snoop Filter with Adaptive Mechanism in Multiprocessor Systems |
Advisor: | 張世杰 Chang, Shih Chieh |
Oral defense committee: | 陳添福 Chen, Tien Fu; 林政宏 Lin, Cheng Hung |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Year of publication: | 2016 |
Academic year of graduation: | 105 |
Language: | Chinese |
Number of pages: | 24 |
Keywords (Chinese): | 竊聽過濾器、快取一致性、第一型錯誤預測、過濾比率 |
Keywords (English): | snoop filter, cache coherence, false positive prediction, filter rate |
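In modern computer systems and multicore systems, cache coherence is essential for keeping data consistent across all cores. The simplest way to maintain cache coherence is to broadcast a snoop request to the caches of all cores; each core that receives the request then performs a cache tag lookup to determine whether its cache holds the data. Because most of the tag lookups triggered by snoop requests miss and therefore only waste power, snoop filters are widely used to reduce power consumption by filtering out unnecessary cache tag lookups.
However, snoop filters face a similar problem: false positive (Type I error) predictions consume a large amount of power. Fundamentally, snoop filter design must trade off filter rate against hardware cost. Traditionally, the filter rate of a snoop filter rises as its memory capacity grows, but the added memory increases the hardware burden. To address this, this study proposes an effective dynamic adaptive technique that lets a snoop filter achieve a higher filter rate while using less memory. Experimental results show that the technique improves the filter rate by 19.17% and reduces memory usage by 76.1%.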
Cache coherence is essential in modern heterogeneous computing systems and multicore systems: it ensures that all processors, or bus masters, in the system hold consistent data in their local caches. The simplest cache coherence mechanism broadcasts a snoop request to all processor caches; each cache that receives the request performs a cache tag lookup to determine whether it holds the data. For most workloads, the majority of tag lookups triggered by snoop requests miss and waste power, so snoop filters have been widely adopted to reduce power consumption by filtering out unnecessary cache tag lookups.
However, snoop filters suffer from a similar problem: their false positive predictions consume a large amount of power. Designing an efficient snoop filter therefore requires a tradeoff between filter rate and hardware overhead. Traditionally, the filter rate is improved by increasing memory capacity, but the additional memory adds hardware overhead. In this paper, we propose an efficient adaptive mechanism that can be applied to snoop filters to improve the filter rate with low hardware overhead. The basic idea is to create multiple copies of a small snoop filter and to distribute cache tags evenly among the copies, based on operating system analytics. The adaptive mechanism improves the filter rate by reducing false positive predictions. Experimental results on the SPLASH-2 benchmarks show that applying the adaptive mechanism to JETTY snoop filters improves the filter rate by 19.17% and reduces memory usage by 76.1% on average.
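The terms used in the abstract (filtered snoop, false positive prediction, filter rate) can be illustrated with a minimal Python sketch. This is not the thesis's adaptive mechanism and not the JETTY design: it assumes a generic counting filter indexed by the low bits of the block address, and the names `CountingSnoopFilter` and `replay_snoops`, as well as the definition of filter rate as the fraction of cache-missing snoops that are filtered, are assumptions made for illustration only.

```python
class CountingSnoopFilter:
    """Hypothetical counting filter: one counter per index, no tags stored."""

    def __init__(self, num_entries):
        self.num_entries = num_entries
        self.counts = [0] * num_entries

    def _index(self, block_addr):
        # Assumed indexing scheme: low bits of the block address.
        return block_addr % self.num_entries

    def on_fill(self, block_addr):
        # Called when the local cache allocates a block.
        self.counts[self._index(block_addr)] += 1

    def on_evict(self, block_addr):
        # Called when the local cache evicts a block.
        i = self._index(block_addr)
        if self.counts[i] == 0:
            raise ValueError("evicting a block that was never filled")
        self.counts[i] -= 1

    def may_be_cached(self, block_addr):
        # False means "definitely not cached": the tag lookup can be skipped.
        # True means "possibly cached": the tag lookup must still be performed.
        return self.counts[self._index(block_addr)] > 0


def replay_snoops(cached_blocks, snoops, num_entries):
    """Replay snoop requests against one cache and classify each outcome."""
    filt = CountingSnoopFilter(num_entries)
    for block in cached_blocks:
        filt.on_fill(block)
    cached = set(cached_blocks)

    filtered = hits = false_positives = 0
    for addr in snoops:
        if not filt.may_be_cached(addr):
            filtered += 1          # tag lookup avoided
        elif addr in cached:
            hits += 1              # tag lookup needed and useful
        else:
            false_positives += 1   # tag lookup performed but misses

    misses = filtered + false_positives
    # Assumed definition: share of cache-missing snoops that were filtered.
    filter_rate = filtered / misses if misses else 0.0
    return {"filtered": filtered, "hits": hits,
            "false_positives": false_positives, "filter_rate": filter_rate}


if __name__ == "__main__":
    print(replay_snoops([0, 1, 2, 3], [0, 4, 8, 5, 16], num_entries=8))
```

In this toy run, the snoops for blocks 8 and 16 share an index with cached block 0, so the filter cannot rule them out and the tag lookup is wasted. Under these assumptions, enlarging the filter removes such collisions at the cost of more memory, which is the tradeoff the abstract describes.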
[1] M. Blumrich, V. Salapura and A. Gara, “Exploring the architecture of a stream register-based snoop filter.” Transactions on high-performance embedded architectures and compilers III, pp. 93-114, 2011.
[2] A. Moshovos, G. Memik, B. Falsafi and A. Choudhary, “JETTY: Filtering Snoops for Reduced Energy Consumption in SMP Servers.” Paper presented at The Seventh International Symposium on High-Performance Computer Architecture, pp. 85-96, 2001.
[3] J. Renau et al. SESC simulator, January 2005. Retrieved from http://sesc.sourceforge.net.
[4] V. Salapura, M. Blumrich and A. Gara, “Improving the accuracy of snoop filtering using stream registers.” Paper presented at Proceedings of the 2007 workshop on MEmory performance: DEaling with Applications, systems and architecture, pp. 25-32, 2007.
[5] V. Salapura, M. A. Blumrich and A. Gara, “Design and implementation of the Blue Gene/P snoop filter.” Paper presented at 2008 IEEE 14th International Symposium on High Performance Computer Architecture, pp. 5-14, 2008.
[6] J. P. Singh, W. D. Weber, and A. Gupta, “SPLASH: Stanford parallel applications for shared memory.” ACM SIGARCH Computer Architecture News, 20 (1): pp. 5-44, 1992.
[7] S. C. Woo, M. Ohara, E. Torrie, J. P. Singh, and A. Gupta, “The SPLASH-2 Programs: Characterization and Methodological Considerations.” ACM SIGARCH Computer Architecture News, 23 (2): pp. 24-36, 1995.
[8] D. H. Woo, M. Ghosh, E. Ozer, S. Biles and H. H. S. Lee, “Reducing Energy of Virtual Cache Synonym Lookup using Bloom Filters.” Paper presented at Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems, pp. 179-189, 2006.