Graduate Student: 康浩平 (Kang, Hao-Ping)
Thesis Title: Improving Convolutional Neural Networks by Separable Filters (以可分離濾波器增進卷積神經網路的效能)
Advisor: 李哲榮 (Lee, Che-Rung)
Committee Members: 周志遠, 劉炳傳
Degree: Master
Department: Department of Computer Science, College of Electrical Engineering and Computer Science
Publication Year: 2014
Graduation Academic Year: 102
Language: English
Number of Pages: 40
Keywords: GPU (GPGPU), convolutional neural networks, deep learning, separable filters
Chinese Abstract:
Convolutional neural networks are a kind of deep learning architecture. Because they perform very well on image recognition, they have become a focus of current research. However, their training process is extremely slow: even a GPU with powerful computing capability needs days to finish training, which limits their applications.
In this thesis, we propose using separable filters to speed up convolutional neural networks. First, each 2D filter in the network is approximated by an SVD decomposition, yielding two 1D filters. Second, these two 1D filters are used to perform two 1D convolutions in place of the original 2D convolution, which effectively reduces the amount of computation. In our GPU implementation, we implemented a batched SVD that can process the SVDs of many small matrices at the same time. In addition, we propose three different methods for computing the convolutions; these methods use different kinds of memory depending on the filter size, in order to improve computational efficiency.
The results show a 1.38x–2.66x speedup in the forward and backward passes. In terms of overall training speed, we obtain a 13% improvement, with a 1% drop in accuracy.
English Abstract:
Convolutional neural networks are among the most widely used deep architectures in machine learning. While they achieve superior recognition performance, especially on images, their training remains a computational challenge that hinders practical use: even GPUs with great computational power may take days to produce results.
In this thesis, we propose a method based on separable filters to reduce the training time. First, using SVD, each 2D filter in the convolutional neural network is approximated by the product of two 1D filters. Second, two 1D convolutions with the resulting 1D filters are performed in place of the original 2D convolution. In our GPU implementation, we present a batched SVD that can decompose multiple small matrices simultaneously, together with three convolution methods that use different memory spaces according to the filter size.
Our experimental results show that a 1.38x–2.66x speedup is achieved in the forward and backward passes. The overall training time is reduced by 13%, with a 1% drop in recognition accuracy.
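To make the approach concrete, below is a minimal NumPy sketch of the separable-filter idea, for illustration only: it is not the thesis's GPU implementation, and the names, sizes, and the use of `scipy.signal.convolve2d` are assumptions chosen for the demo. It shows a batched SVD over a stack of small filters, the rank-1 approximation of each 2D filter as the outer product of two 1D filters, and the replacement of one 2D convolution by two 1D convolutions.

```python
# Minimal sketch of separable filters via SVD (illustrative, not the
# thesis's CUDA implementation; all names and sizes are assumptions).
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)

# A bank of 16 small 5x5 filters, standing in for one convolutional layer.
filters = rng.standard_normal((16, 5, 5))

# Batched SVD: np.linalg.svd factors the trailing two axes, so all 16
# small matrices are decomposed in a single call.
U, S, Vt = np.linalg.svd(filters)

# Rank-1 approximation of each filter: W ~ s1 * u1 * v1^T, i.e. the outer
# product of a vertical 1D filter and a horizontal 1D filter.
col = U[:, :, 0] * S[:, :1]   # (16, 5) vertical 1D filters, scaled by s1
row = Vt[:, 0, :]             # (16, 5) horizontal 1D filters

image = rng.standard_normal((32, 32))

# Original 2D convolution with the first filter...
out_2d = convolve2d(image, filters[0], mode="valid")

# ...versus two 1D convolutions: a column pass, then a row pass.
out_sep = convolve2d(
    convolve2d(image, col[0][:, None], mode="valid"),
    row[0][None, :], mode="valid")

# The two 1D passes reproduce the rank-1 filter exactly (up to rounding);
# the gap to the full filter is the rank-1 approximation error.
rank1 = np.outer(col[0], row[0])
print(np.max(np.abs(out_sep - convolve2d(image, rank1, mode="valid"))))  # ~1e-15
print(np.max(np.abs(out_sep - out_2d)))  # rank-1 approximation error
```

The computational saving is where the speedup comes from: a k x k filter costs k^2 multiply-adds per output pixel, while the two 1D passes cost 2k, e.g. 25 versus 10 for k = 5, at the price of the rank-1 approximation error noted above.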