
Graduate Student: Chen, Chun-Han (陳君函)
Thesis Title: Mining Structures of Convolutional Neural Networks: An Energy Perspective (從能量層面探討卷積神經網絡及其架構)
Advisor: Chang, Shih-Chieh (張世杰)
Committee Members: Peng, Wen-Chih (彭文志); Huang, Chih-Tsun (黃稚存)
Degree: Master
Department:
Year of Publication: 2017
Graduation Academic Year: 106
Language: English
Number of Pages: 37
Keywords (Chinese): Deep Learning; Convolutional Neural Networks; Energy Consumption
  • In recent years, convolutional neural networks have not only played an important role in computer vision but have also been widely applied to image recognition. Consequently, the computational complexity and energy consumption of CNNs have become an important issue, especially for deployment on embedded systems or other battery-powered mobile devices. Beyond reducing computational complexity, if we could estimate the energy consumption of a given CNN before the training or test phase, we could determine whether that network is suitable for mobile devices. Motivated by this, we propose a model that effectively predicts the energy consumption of a CNN. In this thesis, we first analyze in detail the relation between the internal parameters of different CNNs and their kernel functions; based on these observations, we then propose a model that effectively estimates the energy consumption of a CNN before the training or test phase. We use the CIFAR-10 dataset as our experimental data and run the experiments in Caffe; our method predicts the energy consumption of CNNs with an average error rate of only 14.41%.
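The abstract's premise is that a network's cost can be derived from its configuration alone, before any training. As a minimal sketch of that idea, the following computes the multiply-accumulate count of a single convolution layer from its parameters; the layer values used below (a CIFAR-10-sized input with hypothetical filter settings) are illustrative assumptions, not figures from the thesis.

```python
def conv_output_size(in_size, kernel, stride, pad):
    """Spatial output size of a square convolution layer."""
    return (in_size + 2 * pad - kernel) // stride + 1

def conv_macs(in_size, in_ch, out_ch, kernel, stride, pad):
    """Multiply-accumulate count for one convolution layer.

    Each output element requires kernel * kernel * in_ch
    multiply-accumulate operations.
    """
    out_size = conv_output_size(in_size, kernel, stride, pad)
    return out_size * out_size * out_ch * kernel * kernel * in_ch

# Hypothetical example: 32x32 RGB input (CIFAR-10 size),
# 32 filters of size 5x5, stride 1, padding 2.
macs = conv_macs(in_size=32, in_ch=3, out_ch=32, kernel=5, stride=1, pad=2)
print(macs)  # 32 * 32 * 32 * 5 * 5 * 3 = 2457600
```

Summing such per-layer counts over a configuration gives a static proxy for computational cost, which is the kind of quantity a pre-training energy estimate can be built on.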


    Recently, convolutional neural networks (CNNs) have drawn much attention and been widely applied to image recognition; therefore, computational complexity and energy consumption have become major issues for deploying CNNs, especially on embedded systems or other battery-powered mobile devices. Apart from reducing the complexity of network computations, if we could estimate the energy consumption of a given network configuration before the training or test phase, we could determine whether the CNN can be deployed on mobile devices. As a result, we propose a predictive energy model that effectively predicts the energy consumption of a CNN. In this work, we first analyze in detail the relation between different network configurations and the kernel function operations reported by the NVIDIA profiler tool; then, based on this analysis, we propose a predictive energy model that estimates energy consumption from the architecture of a convolutional neural network before the test phase. The experiments use the CIFAR-10 dataset and are implemented in Caffe; the overall error rate of our methodology for predicting energy consumption is 14.41%.
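The predictive model described above is fit with linear regression (the thesis uses MATLAB on NVIDIA profiler data). The following is a minimal sketch of that fitting step in Python, assuming hypothetical per-network feature columns (operation counts by kernel-function class) and hypothetical measured energies; none of the numbers below come from the thesis.

```python
import numpy as np

# Rows: profiled networks. Columns (hypothetical): operation counts
# for three classes of kernel functions, in billions.
X = np.array([[2.4, 0.30, 0.10],
              [4.8, 0.60, 0.10],
              [9.6, 1.20, 0.20],
              [4.8, 0.30, 0.20],
              [1.2, 0.15, 0.05],
              [7.2, 0.90, 0.15]])
# Measured energy per network in joules (hypothetical).
y = np.array([1.1, 2.0, 3.9, 1.9, 0.6, 3.0])

# Append a bias column and solve for coefficients by least squares.
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Evaluate the fit with the mean relative error, the same style of
# metric as the thesis's reported error rate.
pred = A @ coef
error_rate = np.mean(np.abs(pred - y) / y) * 100
print(error_rate)
```

In practice the model would be fit on one set of profiled networks and evaluated on held-out configurations, so that the error rate reflects prediction rather than fitting quality.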

    1 Introduction
    2 Problem Definition
      2.1 Convolutional Neural Networks
      2.2 Problem Formulation
      2.3 Overall Analysis
    3 Analysis
      3.1 Kernel Functions Classification
        3.1.1 Fixed Number
        3.1.2 Linear Number
        3.1.3 Approximate Number
        3.1.4 Conditional Number
      3.2 Operation and Runtime Analysis
      3.3 Predictive Energy Model
    4 Methodology
      4.1 Overview
      4.2 The Architecture: AlexNet
      4.3 The Dataset: CIFAR-10
      4.4 NVIDIA Profiling
      4.5 MATLAB Linear Regression
    5 Experiments
      5.1 Experimental Setup
      5.2 Experimental Result
      5.3 Energy Consumption Analysis
      5.4 Networks Performance Analysis
    6 Conclusion

