
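Author: 陳信勳
Chen, Shin-Shiun
Thesis title: Advanced Memory-Processor Stacking Architecture for High Performance and Low Power
高效能及低功耗之先進記憶體與處理器堆疊架構
Advisor: 吳誠文
Wu, Cheng-Wen
Oral defense committee: 蘇朝琴
周世傑
黃錫瑜
吳誠文
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2011
Academic year of graduation: 99 (ROC calendar; 2010-2011)
Language: Chinese
Number of pages: 58
Keywords (Chinese): 三維堆疊, 動態隨機存取記憶體, 介面, 架構, 電子化系統層級
Keywords (English): 3-D Stacking, DRAM, Interface, Architecture, ESL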
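  • In this thesis, we use electronic system level (ESL) methodology and simulation to propose, at the architecture level, a processor-memory stacking architecture based on through-silicon via (TSV) 3-D stacking technology. Our architecture removes the conventional cache, which reduces both cost and energy consumption and makes it well suited to power-sensitive embedded systems and handheld devices. By exploiting the large I/O bandwidth that TSVs provide, together with the narrower processor-memory speed gap that 3-D stacking brings, we redesign the internal architecture and interface protocol of the 3-D stacked dynamic random-access memory (DRAM) to simplify the memory controller and further reduce energy and cost.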

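    We build our timing and power models from existing commercial DRAM parts so that the architecture can be validated. The power model uses DDR2 as its reference, and the timing model uses Fast-Cycle RAM (FCRAM). To simulate whole-system behavior, we use a virtual platform built with the ESL methodology, together with a cycle-accurate processor model, for architecture analysis and validation. The experimental results show that, compared with the original 2-D design, the proposed architecture improves system performance by 23.5% while consuming only 20% of the original energy.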
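As a rough illustration of what the two headline figures imply together, the sketch below derives relative execution time, average power, and energy-delay product from them. It assumes that "performance improves by 23.5%" means a 1.235x speedup on the same workload; the abstract does not define the metric, so that reading, the function name `derived_ratios`, and the derived quantities are assumptions rather than results reported in the thesis.

```python
def derived_ratios(perf_gain: float, energy_ratio: float) -> dict:
    """Derive 3-D vs. 2-D ratios from a performance gain and an energy ratio.

    Assumption (not stated in the abstract): performance is the inverse of
    execution time, so a gain of `perf_gain` is a speedup of 1 + perf_gain.
    """
    speedup = 1.0 + perf_gain
    time_ratio = 1.0 / speedup               # 3-D runtime / 2-D runtime
    power_ratio = energy_ratio / time_ratio  # average power = energy / time
    edp_ratio = energy_ratio * time_ratio    # energy-delay product
    return {
        "time_ratio": time_ratio,
        "power_ratio": power_ratio,
        "edp_ratio": edp_ratio,
    }


# Figures quoted in the abstract: +23.5% performance, 20% of the energy.
ratios = derived_ratios(perf_gain=0.235, energy_ratio=0.20)
for name, value in ratios.items():
    print(f"{name}: {value:.3f}")
```

Under that assumption, the 3-D design would run in about 81% of the 2-D execution time, draw about 25% of the average power, and reach about 16% of the energy-delay product.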


    1 Introduction
      1.1 Motivation
      1.2 Three Dimensional (3-D) Integration Technology
        1.2.1 Wire Bonding Technique
        1.2.2 Micro-Bump Technique
        1.2.3 Through Silicon Via (TSV)
      1.3 Modern DRAM Architecture
        1.3.1 Commodity DRAM
        1.3.2 Fast-Cycle RAM (FCRAM)
        1.3.3 Reduced-Latency DRAM (RLDRAM)
      1.4 DRAM-Processor-Stacked Architecture
      1.5 Electronic System Level (ESL) Technique
      1.6 Thesis Organization
    2 Architecture Evaluation
      2.1 PAC Duo SOC
      2.2 System Analysis
        2.2.1 2-D Version
        2.2.2 3-D Version
      2.3 Simulation Schemes with Virtual Platform
        2.3.1 TSV Channel Modeling
        2.3.2 DRAM Modeling
    3 Stacked-DRAM Architecture
      3.1 Sans-Cache DRAM Architecture
        3.1.1 Eliminate Address Multiplexing
        3.1.2 Small Active Row Size
        3.1.3 Wide I/O Design
        3.1.4 A Scheme without DDR
        3.1.5 Auto-Precharging Scheme
      3.2 Timing Model
      3.3 Power Model
        3.3.1 DDR2 Current Profiles
        3.3.2 Power Model of SCDRAM
    4 Experimental Results
      4.1 Experiment Environment
      4.2 Architecture Comparisons
      4.3 Benchmark Comparisons
      4.4 Discussion
    5 Conclusions and Future Work
      5.1 Conclusions
      5.2 Future Work
        5.2.1 Memory Controller Design for SCDRAM
        5.2.2 Many-Core System

