Graduate student: | 陳奕儒 Chen, Yi-Ju |
---|---|
Thesis title: | A 6T Transpose SRAM using Hierarchical-Bitline-Steering Control Scheme |
Advisor: | 張孟凡 Chang, Meng-Fan |
Oral defense committee: | 許世玄 Sheu, Shyh-Shyuan; 呂仁碩 Liu, Ren-Shuo |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2017 |
Academic year of graduation: | 106 |
Language: | English |
Number of pages: | 59 |
Chinese keywords: | 靜態隨機存取記憶體, 轉置, 分層位元線 |
English keywords: | Static Random Access Memory, Transpose, Hierarchical-Bitline |
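Emerging AI neural network algorithms have made special-purpose memories and near-memory and in-memory computing designs attractive research topics. Such special-purpose memories usually serve as hardware accelerators in a system, giving it higher performance or lower power consumption.
Transpose memory is one class of special-purpose memory and is widely used in computer matrix operations, two-dimensional image processing, and AI neural networks. Because a transpose memory can read and write stored data in different directions, it reduces the design and operational complexity of these applications.
In this work, we propose a 6T transpose SRAM using a hierarchical-bitline-steering control scheme. It performs transpose reads and writes by using local I/O circuits to steer the local bitlines to global bitlines that run in different directions. This 6T transpose SRAM is the first solution compatible with the logic memory process, which lets our design occupy less area than designs that require customized memory cells while still ensuring high yield.
A 64Kb 6T transpose SRAM with the hierarchical-bitline-steering control scheme was built in a 65nm CMOS logic process. Measurements show that the chip achieves a read speed of 0.8ns at a 1V supply and operates at supply voltages as low as 400mV. Its figure of merit, (operating speed) divided by (macro area times energy consumption), is more than three times that of other transpose memories.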
Emerging AI neural network algorithms have made special memories and near-memory and in-memory computing designs attractive research topics. These feature memories act as hardware accelerators and bring high performance or low energy consumption to the system.
Transpose memory is one category of feature memory. It has been widely employed in matrix multiplication in computer graphics, data processing algorithms in 2D image processing, and AI neural networks, because its ability to access stored data in orthogonal directions reduces the design and operational complexity of these applications.
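The orthogonal access described above can be illustrated with a purely behavioral sketch. It models only the functional view of a transpose memory (the same stored bits readable and writable by row or by column) and says nothing about the circuit. The class name `TransposeMemory`, its methods, and the helper `transpose_via_memory` are hypothetical names chosen for illustration and are not taken from the thesis.

```python
class TransposeMemory:
    """Behavioral model (an assumption, not the thesis circuit) of a
    square bit array that can be accessed by row or by column."""

    def __init__(self, size):
        self.size = size
        self.cells = [[0] * size for _ in range(size)]

    def _check(self, bits):
        if len(bits) != self.size:
            raise ValueError("word length must equal the array size")

    def write_row(self, r, bits):
        # Normal-direction write: one word along a row.
        self._check(bits)
        self.cells[r] = list(bits)

    def write_col(self, c, bits):
        # Transpose-direction write: one word along a column.
        self._check(bits)
        for r, b in enumerate(bits):
            self.cells[r][c] = b

    def read_row(self, r):
        # Normal-direction read.
        return list(self.cells[r])

    def read_col(self, c):
        # Transpose-direction read of the same stored bits.
        return [row[c] for row in self.cells]


def transpose_via_memory(matrix):
    """Write a square matrix row by row, then read it back column by
    column, which yields the transposed matrix."""
    n = len(matrix)
    mem = TransposeMemory(n)
    for r, row in enumerate(matrix):
        mem.write_row(r, row)
    return [mem.read_col(c) for c in range(n)]
```

With a conventional row-only memory, reading one column costs one access per row. In this model a column comes back in a single `read_col` call, which is the kind of access-pattern simplification the paragraph above refers to.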
In this work, we propose a 6T transpose static random access memory using a hierarchical-bitline-steering control scheme (HBS-TRAM). The proposed design is based on a hierarchical bit-line structure and achieves transpose read and write by using local I/O circuits to steer the local bit-lines to global bit-lines in different directions. The HBS-TRAM 6T cell provides, for the first time, a solution that uses the same footprint as the foundry compact cell. This enables our transpose SRAM design to achieve a smaller area than customized SRAM cells that do not use the aggressive layout rules for SRAM, guaranteeing small area and high yield for the memory cells.
A 64Kb HBS-TRAM fabricated in a 65nm CMOS logic process achieved a 0.8ns access time at VDD=1.0V in oscilloscope measurements, and its working supply voltage can go down to 0.4V. The figure of merit (FOM), [Speed]/[Macro-Area × Energy], is over 3.1x higher than that of other transpose SRAM designs.
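Written out as an equation, the figure of merit quoted above takes the form below. The abstract does not give the units of each term or say how Speed is derived from the measured access time, so this only restates the bracketed expression.

```latex
\mathrm{FOM} = \frac{\text{Speed}}{\text{Macro-Area} \times \text{Energy}}
```

A larger FOM is better under this definition: it rewards faster operation and penalizes both macro area and energy consumption.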