An Energy-Efficient Successive Approximation Reference Current Generator for Non-volatile Computing-In-Memory Macro


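Graduate student:	劉哲旭 Liu, Je-Syu
Thesis title:	An Energy-Efficient Successive Approximation Reference Current Generator for Non-volatile Computing-In-Memory Macro (original Chinese title: 應用於非揮發性記憶體內運算巨集之高能源效率循序漸近式參考電流產生器)
Advisor:	張孟凡 Chang, Meng-Fan
Oral defense committee:	洪浩喬, 謝志成, 邱瀝毅
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication:	2019
Academic year of graduation:	108 (ROC calendar)
Language:	English
Number of pages:	41
Keywords (translated from Chinese):	Liu, Je-Syu (劉哲旭), non-volatile memory, computing-in-memory, reference current generator
Keywords (English):	Je-Syu Liu, non-volatile memory, computing-in-memory, reference current generator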

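With the rise of deep neural networks and artificial intelligence, high-speed computation at low power on mobile devices has become a primary requirement, and edge computing has emerged in response. Edge computing uses an already-trained model to complete inference within a given schedule or time, which takes far fewer operations than the massive computation needed for training. New concepts such as Computing-In-Memory, Near-Memory-Computing, and Processor-In-Memory then appeared. By cutting the large energy cost of data movement and by combining storage and computation in one place, computing-in-memory serves as a hardware accelerator in deep neural network processors.

This thesis proposes a successive approximation reference current generator that achieves high output precision after multiply-and-accumulate operations in a non-volatile computing-in-memory macro, while also being energy-efficient and area-saving. Using the TSMC 55 nm CMOS logic process, the thesis improves the figure of merit of a 2Mb resistive-memory computing-in-memory macro by 1.38x and 7.02x at 6-bit and 8-bit outputs, respectively.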

With the rapid development of deep neural networks (DNNs) and artificial intelligence (AI), mobile devices require high operation speed and low power consumption. Edge computing has emerged to meet this requirement, since it can use a trained model to complete inference within a specific schedule or time, which avoids the huge number of operations required for training. New concepts have therefore emerged, such as Computing-In-Memory (CIM), Near-Memory-Computing (NMC), and Processing-In-Memory (PIM). By reducing the energy consumed by data movement and performing computation inside the memory, CIM becomes an AI accelerator for DNN processors.

To support high output precision for multiply-and-accumulate (MAC) operations in a non-volatile computing-in-memory macro, we propose a Successive Approximation Reference Current Generator (SARCG) that achieves energy efficiency and area reduction at high output precision. Applied to a 2Mb ReRAM CIM macro fabricated in the TSMC 55nm CMOS logic process, the proposed scheme improves the figure of merit (FoM) by 1.38x and 7.02x for 6-bit and 8-bit outputs, respectively.
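As a rough illustration of the general successive-approximation idea named in the abstract, the sketch below quantizes an analog MAC current by comparing it against one reference current per output bit. It is not the SARCG circuit, whose details are in Chapter 4 of the thesis and are not reproduced in this record. The function name `sar_quantize`, the parameters `i_mac` and `i_lsb`, and the ideal, noise-free comparison are all assumptions made for illustration.

```python
def sar_quantize(i_mac, i_lsb, n_bits):
    """Quantize a current by successive approximation (hypothetical model).

    i_mac  : analog current to be quantized (arbitrary units, assumed ideal)
    i_lsb  : current step that corresponds to one output code (assumed)
    n_bits : output precision in bits
    Returns (code, steps), where steps is the number of comparisons made.
    """
    code = 0
    steps = 0
    # Decide one bit per step, from the most significant bit down.
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)   # tentatively set this bit
        i_ref = trial * i_lsb       # reference current for the trial code
        steps += 1
        if i_mac >= i_ref:          # ideal comparator: keep the bit if the current reaches the reference
            code = trial
    return code, steps


if __name__ == "__main__":
    for n_bits in (6, 8):
        code, steps = sar_quantize(37.2, 1.0, n_bits)
        # A fully parallel comparison would need 2**n_bits - 1 reference levels.
        print(f"{n_bits}-bit: code={code}, comparisons={steps}, "
              f"parallel levels={2**n_bits - 1}")
```

In this generic model the number of reference comparisons grows linearly with the output precision, whereas comparing against every level at once needs 2^n - 1 references. Whether and how the thesis exploits this property is not stated in the abstract.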

Contents
Abstract (Chinese)
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
Chapter 1    Introduction
1.1 Memory Landscape
1.2 von Neumann Bottleneck
1.3 Computing-In-Memory (CIM)
Chapter 2    Characteristic of Contact-ReRAM
2.1 Structure of Contact-ReRAM
2.2 Switching Mechanisms
2.3 Write Operations
2.4 Read Operation
Chapter 3    Previous Work
3.1 The role of Reference generator in CIM structure
3.2 Previous works
3.2.1 Input-Aware dynamic IREF generation (IA-REF)
3.2.2 Global-VREF-GEN & Local-IREF-SEL scheme
Chapter 4    Proposed Circuit Scheme and Analysis
4.1 Proposed Successive Approximation Reference Current Generator (SARCG)
4.1.1 Motivation and Concept
4.1.2 Structure of Proposed SARCG Scheme
4.1.3 Operations of Proposed Scheme
4.2 Analysis and Comparison
4.2.1 Power
4.2.2 Area
4.2.3 Figure of Merit (FoM)
4.3 Summary
Chapter 5    Measurement Results and Conclusion
5.1 Floor Plan of ReRAM CIM Macro
5.2 Design for Test-chip
5.3 Measurement results
5.4 Conclusions and Future Work
Reference


                                

