| Field | Value |
|---|---|
| Author | 張雅鈞 Chang, Ya-Chun |
| Title | 用於二值化神經網路推論的卷積結果共享方法 A Convolutional Result Sharing Approach for Binarized Neural Network Inference |
| Advisor | 王俊堯 Wang, Chun-Yao |
| Committee members | 江介宏 Jiang, Jie-Hong; 溫宏斌 Wen, Hung-Pin |
| Degree | Master |
| Department | College of Electrical Engineering and Computer Science, Department of Computer Science |
| Year of publication | 2019 |
| Graduation academic year | 107 |
| Language | English |
| Number of pages | 30 |
| Keywords (Chinese) | 卷積神經網路, 二值化神經網路, 近似運算 |
| Keywords (English) | convolutional neural network, binarized neural network, approximate computing |
Binarized neural networks (BNNs) enable a much more efficient implementation of convolutional neural networks (CNNs) on mobile platforms. During inference, the multiply-accumulate operations in a BNN can be simplified to XNOR-popcount operations, which account for most of the computation in BNNs. To reduce the number of operations required in the convolution layers of a BNN, we decompose the 3-D filters into 2-D filters and exploit repeated filters, inverse filters, and similar filters to share convolutional results. By sharing convolutional results, the number of operations in the convolution layers of a BNN can be reduced effectively. Experimental results show that, for CIFAR-10 and SVHN, the number of operations in the convolution layers of BNNs can be reduced by about 60% while keeping the accuracy loss within 1% of the originally trained networks.
The binary-weight, binary-input binarized neural network (BNN) enables a much more efficient implementation of convolutional neural networks (CNNs) on mobile platforms.
During inference, the multiply-accumulate operations in BNNs can be reduced to XNOR-popcount operations; as a result, XNOR-popcount operations dominate the computation in BNNs (a minimal sketch of this reduction is given after the abstract).
To reduce the number of operations required in the convolution layers of BNNs, we decompose the 3-D filters into 2-D filters and exploit repeated filters, inverse filters, and similar filters to share convolutional results. Sharing the convolutional results in this way effectively reduces the number of operations in the convolution layers of BNNs (a sketch of such result sharing also follows the abstract).
Experimental results show that the number of operations can be reduced by about 60% for CIFAR-10 and SVHN on BNNs while keeping the accuracy loss within 1% of the originally trained networks.
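To make the XNOR-popcount reduction concrete, here is a minimal Python sketch, not taken from the thesis: it assumes +1/-1 values are packed into integer bit masks, and the helper names `encode_bits` and `xnor_popcount_dot` are illustrative only. It relies on the standard identity that, for two ±1 vectors of length N, the dot product equals 2 * popcount(XNOR(a, b)) - N.

```python
# Minimal sketch (illustrative, not the thesis implementation) of replacing a
# multiply-accumulate dot product of +/-1 vectors with XNOR and popcount.
# Encoding assumption: +1 -> bit 1, -1 -> bit 0.

def encode_bits(vec):
    """Pack a list of +1/-1 values into an integer bit mask (+1 -> 1, -1 -> 0)."""
    bits = 0
    for i, v in enumerate(vec):
        if v == 1:
            bits |= 1 << i
    return bits

def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two +/-1 vectors of length n from their bit encodings.
    XNOR marks positions where the signs agree; popcount counts them:
    dot = (#agree) - (#disagree) = 2 * popcount(xnor) - n."""
    xnor = ~(a_bits ^ b_bits) & ((1 << n) - 1)  # keep only the n valid bits
    agree = bin(xnor).count("1")                # popcount
    return 2 * agree - n

# Usage: the result matches the ordinary multiply-accumulate sum.
a = [1, -1, 1, 1, -1]
b = [1, 1, -1, 1, -1]
assert xnor_popcount_dot(encode_bits(a), encode_bits(b), len(a)) == sum(x * y for x, y in zip(a, b))
```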
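The convolutional-result-sharing idea can likewise be sketched in Python. The sketch below is only an illustration under simplifying assumptions: each 3-D filter is treated as a stack of ±1 2-D slices (one per input channel), a plain "valid" 2-D convolution is used, and the helper names `conv2d_valid` and `shared_conv` are hypothetical. It shows how repeated and inverse 2-D slices could reuse a single stored result; the thesis additionally exploits similar filters, which this sketch does not cover.

```python
# Illustrative sketch (not the thesis implementation) of sharing 2-D convolution
# results across binarized 3-D filters: slices that repeat, or that are the
# element-wise negation (inverse) of an already-computed slice, reuse the
# cached result instead of triggering a new convolution.
import numpy as np

def conv2d_valid(x, w):
    """Plain 'valid' 2-D convolution (cross-correlation, as in CNNs) with a k x k filter."""
    k = w.shape[0]
    h, wd = x.shape[0] - k + 1, x.shape[1] - k + 1
    out = np.zeros((h, wd))
    for i in range(h):
        for j in range(wd):
            out[i, j] = np.sum(x[i:i + k, j:j + k] * w)
    return out

def shared_conv(inputs, filters):
    """inputs: list of 2-D input channels; filters: list of 3-D filters, each a
    list of +/-1 2-D slices (one per input channel). Returns one output map per
    3-D filter while computing each distinct 2-D convolution only once."""
    cache = {}  # (channel index, slice bytes) -> cached 2-D result
    outputs = []
    for f in filters:
        acc = None
        for c, w in enumerate(f):
            key, inv_key = (c, w.tobytes()), (c, (-w).tobytes())
            if key in cache:
                r = cache[key]                  # repeated slice: reuse result
            elif inv_key in cache:
                r = -cache[inv_key]             # inverse slice: negate cached result
            else:
                r = conv2d_valid(inputs[c], w)  # new slice: compute and cache
                cache[key] = r
            acc = r if acc is None else acc + r
        outputs.append(acc)
    return outputs

# Usage: the second 3-D filter reuses one slice and one inverse slice of the first.
rng = np.random.default_rng(0)
x = [np.sign(rng.standard_normal((8, 8))) for _ in range(3)]   # 3 binarized input channels
w0 = [np.sign(rng.standard_normal((3, 3))) for _ in range(3)]  # one 3-D filter as 2-D slices
w1 = [w0[0], -w0[1], np.sign(rng.standard_normal((3, 3)))]
outs = shared_conv(x, [w0, w1])
```

Because binarized weights are ±1, convolving with a negated slice simply negates the stored result, which is why the inverse-filter case costs no additional convolution.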