| Author | 黎晉丞 Li, Jing Cheng |
|---|---|
| Thesis title | 解決探聽過濾器過時化問題的高效架構 An Efficient Architecture for Resolving the Aging Problem of Snoop Filter |
| Advisor | 張世杰 Chang, Shih Chieh |
| Committee members | 金仲達 King, Chung Ta; 鍾文邦 Jone, Wen Ben |
| Degree | Master |
| Department | College of Electrical Engineering and Computer Science - Department of Computer Science |
| Year of publication | 2015 |
| Academic year of graduation | 103 |
| Language | English |
| Pages | 42 |
| Keywords (Chinese) | 探聽式一致性協定、探聽式過濾器、過濾器復興 |
| Keywords (English) | Snoop-based coherence protocol, Snoop filter, Filter rejuvenation |
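Cache coherence is the mechanism that keeps shared data held in caches consistent. Among coherence schemes, snoop-based coherence protocols are very common in multiprocessor system-on-chip applications because of their simplicity. To answer each snoop request, the cache controller performs a cache tag lookup against the tags of its cache lines to determine whether the requested data is present in the cache. Previous studies report that, because the amount of data shared between nodes is limited, about 90% of snoop requests are redundant. These redundant requests waste energy by forcing the cache controller to perform tag lookups. Snoop filters were therefore proposed to screen out useless snoop requests. A snoop filter must compress the addresses of all data the cache has read into the filter. Because of this compression, the filter can make wrong decisions, known as false positives: a false-positive request passes the filter and reaches the cache, and only the tag lookup reveals that it was redundant. Over time, the large amount of compressed information accumulated in the filter raises the probability of false positives, so an inefficient, aged filter causes many wasted tag lookups.
To address the problems caused by an inefficient, aged filter, IBM proposed a method for refreshing the filter, with the refresh triggered when a cache wrap occurs. If cache wraps occur too far apart, the filter starts to lose efficiency, and the refresh may even fail to achieve its purpose. We find that in some applications (SPLASH-2) the interval between cache wraps is long, and the false-positive probability of the filter rises in the meantime. This thesis therefore focuses on how to refresh an aged filter rather than on how to design a filter. We propose our filter rejuvenation technique to solve the problems caused by an inefficient, aged filter.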
Snoop-based coherence protocols are very popular in multiprocessor systems because of their simplicity. In a snoop-based protocol, many cache tag lookups are needed to serve snoop requests. However, it has been shown that about 90% of snoop requests are useless, so the corresponding cache lookups are redundant. To reduce unnecessary cache lookups, the snoop filter scheme was proposed. However, it is known that the efficiency of a snoop filter decreases with time. In other words, an aging filter fails to filter out many unnecessary requests. To solve the problem of an aging snoop filter, [8] proposed a novel way to rejuvenate it so that an aging filter can be refreshed to high efficiency again. We observe that in several real designs, the approach of [8] fails to achieve effective rejuvenation. In this paper, we focus on how to rejuvenate a snoop filter rather than on how to design the snoop filter itself. We propose a novel way of rejuvenating an aging snoop filter using four filter rejuvenation techniques. Our experimental results show that the proposed techniques, when working together, reduce the number of unnecessary requests to 62.23% and the energy consumption to 67.58% on average. In the best case, we reduce the number of unnecessary requests to approximately 30% compared to [8].
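The aging effect described in the abstract can be illustrated with a small simulation. The sketch below is not the stream-register filter of [8] and not one of the four techniques proposed in the thesis. It assumes a hypothetical hash-indexed filter (`ToySnoopFilter`) whose presence bits are set on every cache fill and never cleared on eviction, a FIFO cache, and a rejuvenation step that simply rebuilds the filter from the lines currently resident. All names and parameters are illustrative assumptions.

```python
import random
from collections import OrderedDict


class ToySnoopFilter:
    """Hypothetical filter: one presence bit per hash bucket.

    A set bit means "this address may be cached"; a clear bit means
    "definitely not cached", so the snoop can be dropped without a tag lookup.
    """

    def __init__(self, num_bits=256):
        self.num_bits = num_bits
        self.bits = [False] * num_bits

    def _index(self, line_addr):
        # Assumed hash: simple modulo of the cache-line address.
        return line_addr % self.num_bits

    def record_fill(self, line_addr):
        self.bits[self._index(line_addr)] = True

    def may_be_cached(self, line_addr):
        return self.bits[self._index(line_addr)]

    def rejuvenate(self, resident_lines):
        # Assumed rejuvenation: clear everything, re-insert resident lines only.
        self.bits = [False] * self.num_bits
        for addr in resident_lines:
            self.record_fill(addr)


def false_positive_rate(filt, cache, snoops):
    """Fraction of useless snoops (line not cached) that still pass the filter."""
    useless = [a for a in snoops if a not in cache]
    passed = sum(1 for a in useless if filt.may_be_cached(a))
    return passed / len(useless)


def simulate(num_fills=2000, cache_lines=64, seed=0):
    rng = random.Random(seed)
    filt = ToySnoopFilter()
    cache = OrderedDict()  # insertion order doubles as FIFO replacement order
    for _ in range(num_fills):
        addr = rng.randrange(1 << 20)
        if addr in cache:
            continue
        if len(cache) >= cache_lines:
            cache.popitem(last=False)  # evict oldest line; the filter is not told
        cache[addr] = True
        filt.record_fill(addr)
    snoops = [rng.randrange(1 << 20) for _ in range(5000)]
    aged = false_positive_rate(filt, cache, snoops)
    filt.rejuvenate(cache.keys())
    fresh = false_positive_rate(filt, cache, snoops)
    return aged, fresh, filt, cache


if __name__ == "__main__":
    aged, fresh, _, _ = simulate()
    print(f"false-positive rate of aged filter: {aged:.2f}")
    print(f"false-positive rate after rejuvenation: {fresh:.2f}")
```

Because evicted lines leave their bits set, almost every bucket is eventually marked and the aged filter lets nearly all useless snoops through. After the rebuild, only buckets of resident lines remain set, so the filter still never rejects a line that is actually cached.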
REFERENCES
[1] E. Atoofian and A. Baniasadi, “Using supplier locality in power-aware interconnects and caches in chip multiprocessors,” J. Systems Architecture 54(5): 507-518, 2008.
[2] E. Atoofian, A. Baniasadi and K. Aasaraai, “Speculative supplier identification for reducing power of interconnects in snoopy cache coherence protocols,” CF 2007: 259-266.
[3] M. Blumrich, V. Salapura and A. Gara, “Exploring the architecture of a stream register-based snoop filter,” 2011.
[4] A. Moshovos, “RegionScout: Exploiting Coarse Grain Sharing in Snoop-Based Coherence,”
[5] A. Moshovos, G. Memik, B. Falsafi and A. Choudhary, “JETTY Filtering Snoops for Reduced Energy Consumption in SMP Servers,” HPCA, 2001.
[6] J. Nilsson, A. Landin and Per Stenstrom, “The Coherence Predictor Cache: A Resource-Efficient and Accurate Coherence Prediction Infrastructure,” IPDPS, 2003.
[7] J. Renau et al. SESC simulator, January 2005. http://sesc.sourceforge.net.
[8] V. Salapura, M. A. Blumrich and A. Gara, “Design and implementation of the Blue Gene/P snoop filter,” HPCA, 2008.
[9] V. Salapura, M. Blumrich and A. Gara, “Improving the accuracy of snoop filtering using stream registers,” MEDEA, 2007.
[10] J. P. Singh, W.-D. Weber, and A. Gupta, “SPLASH: Stanford parallel applications for shared-memory,” Computer Architecture News, 1992.
[11] D. Tarjan, S. Thoziyoor and N. P. Jouppi, “CACTI 4.0,” Technical report, Compaq Research Lab, 2006.
[12] R. Ulfsnes, “A survey of low power design techniques for cache coherency in multiprocessor memory systems,” Semester project NTNU, 2012.
[13] S. C. Woo, M. Ohara, E. Torrie, J. P. Singh, and A. Gupta, “The SPLASH-2 Programs: Characterization and Methodological Considerations,” in 22nd International Symposium on Computer Architecture (ISCA), 1995.
[14] D. H. Woo, M. Ghosh, E. Ozer, S. Biles and H.-H. S. Lee, “Reducing Energy of Virtual Cache Synonym Lookup using Bloom Filters,” CASES, 2006.