| Student | 魏聖修 Wei, Sheng-Hsiu |
|---|---|
| Thesis Title | 一個考慮到過濾器重複特性的二值化神經網路卷積結果靈活共享方法的研究 / A Flexible Result Sharing Approach Using Filter Repetitions to Binarized Neural Networks Optimization |
| Advisor | 王俊堯 Wang, Chun-Yao |
| Committee Members | 張世杰 Chang, Shih-Chieh; 陳勇志 Chen, Yung-Chih |
| Degree | Master |
| Department | College of Electrical Engineering and Computer Science, Department of Computer Science |
| Year of Publication | 2021 |
| Academic Year | 109 |
| Language | English |
| Number of Pages | 34 |
| Keywords | Binarized Neural Networks, Filter Repetitions, Convolutional Result Sharing |
Convolutional neural networks (CNNs) achieve excellent accuracy on problems in fields such as computer vision and artificial intelligence (AI). Their variant with binary weights and binary input features, binarized neural networks (BNNs), can be realized more efficiently on edge devices. In the BNN inference model, the original, costly multiply-and-accumulate (MAC) operations can be simplified to XNOR-Popcount operations. Because the weights in a filter are binary, a complete filter can be decomposed into several partial filters, and the repetition of these partial filters can be exploited through result sharing to reduce the number of required XNOR-Popcount operations.

Therefore, in this thesis, we propose a flexible convolutional result sharing approach that reuses and shares stored computation results among partial filters. We also implemented the proposed approach on an FPGA platform. The experimental results show that, compared with the original approach, our method reduces the number of required XNOR-Popcount operations in the hardware implementation to less than 1% of the original, and reduces the number of required LUTs to between 23.6% and 50.6% of the original, depending on the layer.
Convolutional Neural Networks (CNNs) provide excellent accuracy in fields such as computer vision and artificial intelligence (AI), and their variant with binary weights and binary inputs, Binarized Neural Networks (BNNs), can be realized more efficiently on edge devices. In the BNN inference model, the original complex multiplication-and-accumulation (MAC) operations can be simplified to XNOR-Popcount operations. Because the weights in the filters are binarized, a complete filter can be decomposed into several partial filters, and the repetitions of these partial filters can be exploited through result sharing to reduce the number of required XNOR-Popcount operations.
Thus, in this work, we propose a flexible result sharing approach that reuses the computed results among partial filters. We also implement the proposed approach on an FPGA platform. The experimental results show that the number of required XNOR-Popcount operations is reduced to less than 1% of the original, and the number of required LUTs is reduced to between 23.6% and 50.6% of that of the implementation without the proposed approach, depending on the layer.
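The MAC-to-XNOR-Popcount simplification mentioned in the abstract can be illustrated with a minimal sketch. The snippet below is not code from the thesis; it is a plain NumPy check, under the standard BNN convention, that for weights and inputs in {-1, +1} encoded as bits {0, 1}, a dot product of length N equals 2 * popcount(XNOR) - N.

```python
# Minimal sketch (not the thesis code): a binarized MAC is equivalent
# to an XNOR followed by a popcount.
import numpy as np

rng = np.random.default_rng(0)
N = 9                                  # e.g., one 3x3 filter window

w_pm1 = rng.choice([-1, 1], size=N)    # binarized weights in {-1, +1}
x_pm1 = rng.choice([-1, 1], size=N)    # binarized inputs in {-1, +1}

mac = int(np.dot(w_pm1, x_pm1))        # original multiply-and-accumulate

w_bit = (w_pm1 > 0).astype(np.uint8)   # encode +1 -> 1, -1 -> 0
x_bit = (x_pm1 > 0).astype(np.uint8)
xnor = 1 - (w_bit ^ x_bit)             # XNOR: 1 where the bits match
popcount = int(xnor.sum())             # number of matching positions

# matches - mismatches = popcount - (N - popcount) = 2*popcount - N
assert mac == 2 * popcount - N
```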
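The result-sharing idea among partial filters can likewise be sketched conceptually. The code below is an assumption-laden illustration, not the thesis's flexible sharing algorithm or its FPGA implementation: it splits each 3x3 binary filter column-wise into 3x1 partial filters, computes the convolution for each distinct (column pattern, column offset) pair only once, and reuses the stored results to rebuild every complete filter's output.

```python
# Conceptual sketch of convolutional result sharing among partial filters
# (assumed column-wise decomposition; the thesis's approach is more flexible).
import numpy as np

rng = np.random.default_rng(1)
H, W, K, F = 8, 8, 3, 16                      # input size, kernel size, #filters
x = rng.choice([-1, 1], size=(H, W))          # binarized input feature map
filters = rng.choice([-1, 1], size=(F, K, K)) # binarized 3x3 filters

def conv_column(x, col, c):
    """Convolve one Kx1 partial filter placed at column offset c of the window."""
    out = np.zeros((H - K + 1, W - K + 1), dtype=np.int32)
    for i in range(H - K + 1):
        for j in range(W - K + 1):
            out[i, j] = int(np.dot(x[i:i + K, j + c], col))
    return out

# Compute each unique (column pattern, offset) convolution once and share it.
cache = {}
outputs = np.zeros((F, H - K + 1, W - K + 1), dtype=np.int32)
for f in range(F):
    for c in range(K):
        key = (tuple(filters[f, :, c]), c)
        if key not in cache:                  # unseen partial filter: compute
            cache[key] = conv_column(x, filters[f, :, c], c)
        outputs[f] += cache[key]              # repeated partial filter: reuse

# Reference: direct full-filter convolution gives identical results.
full = np.zeros_like(outputs)
for f in range(F):
    for i in range(H - K + 1):
        for j in range(W - K + 1):
            full[f, i, j] = int(np.sum(x[i:i + K, j:j + K] * filters[f]))
assert np.array_equal(outputs, full)

print("unique partial convolutions:", len(cache), "instead of", F * K)
```

With binary 3x1 columns there are only 2^3 = 8 possible patterns per offset, so repetitions across filters are frequent, which is why sharing their results cuts the required XNOR-Popcount work; the thesis exploits such repetitions in a more flexible way in hardware.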