Graduate Student: Huang, Ping-Li (黃秉立)
Thesis Title: A Spike-Based Convolutional Neural Network (SCNN) Accelerator with Reduced On-Chip Memory Data Flow and Sparse Data Operation
Advisor: Tang, Kea-Tiong (鄭桂忠)
Committee Members: Liu, Ren-Shuo (呂仁碩); Lu, Chih-Cheng (盧峙丞)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Publication Year: 2023
Graduation Academic Year: 111 (ROC calendar)
Language: Chinese
Pages: 54
Keywords (Chinese): spiking neural network, convolutional neural network, accelerator, digital circuit, sparsity applications
Keywords (English): CNN, SNN, Accelerators, Digital Circuits, Sparsity
Abstract—The rise of artificial intelligence networks stems from their many applications, such as image recognition and speech recognition. Realizing these applications on edge devices, however, demands higher energy efficiency to cope with device constraints. Spiking neural networks (SNNs) are considered promising candidates because their computational characteristics reduce multiplication: they need only addition and shift operations. Applying this approach to CNNs lowers computational power consumption, with adders implementing the accumulation and shifters replacing the nonlinear operations. The resulting hybrid network is known as a spiking convolutional neural network (Spiking-CNN, SCNN).
However, achieving higher computational speed usually requires a large memory capacity to store the necessary weights and feature maps, which in turn costs chip area and power. This thesis presents an SCNN dataflow that reduces the required on-chip memory, adds a hybrid dataflow that shrinks on-chip memory further, and designs zero-skipping that exploits the high sparsity of SCNNs to lower the power consumed per operation. Together, these techniques reduce the total on-chip memory while improving energy efficiency, reaching 104.76 TOPS/W on the CIFAR-10 dataset. Moreover, compared with other published spiking convolutional neural network accelerators for the same application, the design requires the least on-chip memory.
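The key computational idea above is that binary spikes turn multiply-accumulate into select-and-add, and that the high spike sparsity can be exploited by skipping zero inputs entirely. The following is a minimal NumPy sketch of that idea only; the layer sizes, the single fixed shift, and the threshold value are illustrative assumptions, not the dataflow or circuit proposed in the thesis.

```python
import numpy as np

def scnn_layer_step(spikes, weights, shift=1, threshold=4):
    """One illustrative spike-based update for a single spatial position.

    spikes    : (C_in,)  binary input spikes (0 or 1) at this timestep
    weights   : (C_out, C_in) integer weights
    shift     : right-shift standing in for the shift operations that
                replace multiplication in the spike-based formulation
    threshold : firing threshold for the output neurons (illustrative value)
    """
    membrane = np.zeros(weights.shape[0], dtype=np.int64)

    # Zero-skipping: visit only the indices of nonzero (spiking) inputs,
    # so the high activation sparsity directly removes work.
    for i in np.flatnonzero(spikes):
        # Multiplication-free MAC: a binary spike simply selects a weight
        # column, which is accumulated with adders.
        membrane += weights[:, i]

    # Shifter in place of a multiplier when scaling the accumulated potential.
    membrane >>= shift

    # Threshold-and-fire yields the binary spikes passed to the next layer.
    out_spikes = (membrane >= threshold).astype(np.uint8)
    return out_spikes, membrane


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spikes = (rng.random(64) < 0.1).astype(np.uint8)          # ~90% sparse input
    weights = rng.integers(-8, 8, size=(32, 64), dtype=np.int64)
    out, potential = scnn_layer_step(spikes, weights)
    print(out, potential)
```

In a hardware realization, iterating only over the nonzero spike indices corresponds to reading a compressed list of spike addresses, which is the general mechanism by which zero-skipping reduces both memory traffic and adder activity.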