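Fast HEVC CU partition based on lightened ETH-CNN and Rich CPH datasets (基於效率化ETH-CNN技術來快速切割CU的方法) | National Tsing Hua University Theses and Dissertations Repository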

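Author:	Zhong, Pei-Yu (鐘珮瑀)
Thesis title:	Fast HEVC CU partition based on lightened ETH-CNN and Rich CPH datasets (基於效率化ETH-CNN技術來快速切割CU的方法)
Advisor:	Wang, Jia-Shung (王家祥)
Committee members:	蕭旭峰, 杭學鳴, 彭文孝
Degree:	Master
Department:
Year of publication:	2018
Academic year of graduation:	106 (ROC calendar)
Language:	English
Number of pages:	30
Keywords (Chinese):	High Efficiency Video Coding, coding tree partition, convolutional neural network
Keywords (English):	Network-in-Network, Maxout Networks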

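H.265 (also known as HEVC) is a video and image compression standard approved by ISO in 2013. H.265 compresses a picture by dividing it into Coding Units (CUs) that are coded sequentially in z-scan order. The standard introduces the concept of the Coding Tree, which is composed of Coding Units, Prediction Units, and Transform Units. To improve the quality of the compressed picture, a 64x64 CU can be further subdivided into 32x32, 16x16, and 8x8 CUs, giving four levels in total. H.265 also introduces a number of other new mechanisms that aim to increase compressed picture quality and reduce the number of bits needed for storage, but these come at the cost of complex computation and long computation time.
Because CTU partition has the largest impact on encoding complexity in HEVC, most methods reduce complexity by simplifying CTU processing. This thesis investigates the use of Convolutional Neural Network (CNN) techniques to obtain an efficient speed-up and thereby increase encoding speed.
Some previous methods require hand-crafted features, such as RD cost, quantization parameter (QP), and texture complexity, to predict the CTU partition. These features rely on prior knowledge of their relationship to the CTU partition result. This thesis therefore improves the feature extraction of ETH-CNN and uses the large-scale CPH-Intra database, so that the features of the CTU partition structure can be learned automatically to obtain an efficient speed-up.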
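
The four-level CU quad-tree and z-scan ordering described in this abstract can be illustrated with a small sketch. It is only an illustration, not the thesis's code or the HM encoder's logic: the names `zscan_cus` and `example_split` are hypothetical, and the split decision is a stand-in callback where an encoder would use an RDO search or a CNN prediction.

```python
def zscan_cus(x, y, size, split, min_size=8):
    """Yield (x, y, size) for the leaf CUs of one CTU in z-scan order.

    `split` is a hypothetical callback, split(x, y, size) -> bool, that
    stands in for whatever decides whether a CU is divided further.
    """
    if size > min_size and split(x, y, size):
        half = size // 2
        # z-scan order: top-left, top-right, bottom-left, bottom-right
        for dy in (0, half):
            for dx in (0, half):
                yield from zscan_cus(x + dx, y + dy, half, split, min_size)
    else:
        yield (x, y, size)


def example_split(x, y, size):
    # Assumed toy rule: split the 64x64 CTU, then only its top-left 32x32 CU.
    return size == 64 or (size == 32 and (x, y) == (0, 0))


leaves = list(zscan_cus(0, 0, 64, example_split))
for cu in leaves:
    print(cu)
```

With this toy rule the CTU ends up as four 16x16 CUs followed by three 32x32 CUs, and the leaf areas always sum to the 64x64 CTU area.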

HEVC is intended to provide significantly better coding efficiency than H.264/AVC and its predecessors, but at the expense of extremely high encoding complexity. In particular, the quad-tree partition of the coding unit (CU) consumes a large proportion of the encoding complexity, because the encoder exhaustively searches for the partition that is best in terms of rate-distortion optimization (RDO). In [1], a deep learning approach based on a convolutional neural network (ETH-CNN) was proposed to predict the CU partition and thereby reduce HEVC complexity in intra mode. Their scheme resolves the CU partition of the entire coding tree unit at once instead of one level at a time. Thus, a large-scale training dataset including substantial CU partition data is necessary for solving this complicated problem.
This thesis proposes a lightened ETH-CNN, which augments the ETH-CNN model with several useful CNN ideas, such as Network-in-Network [2], Maxout Networks [3], and Batch Normalization [4], to improve prediction accuracy while also reducing computational complexity. The experimental results demonstrate that the lightened approach increases the accuracy of CU partition prediction (64x64 to 32x32) and decreases the CU partition time (at QP 22).
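
As a rough illustration of the Maxout idea [3] in a fully connected layer, the following plain-Python sketch takes, for each output unit, the maximum over several affine pieces of the input. It is a minimal sketch under assumptions, not the thesis's implementation: the function name `maxout_dense`, the list-based layout, and the toy weights are all hypothetical.

```python
def maxout_dense(x, weights, biases):
    """Maxout fully connected layer for one input vector.

    x:       list of d_in floats
    weights: k affine pieces, each a list of d_out rows with d_in entries
    biases:  k affine pieces, each a list of d_out floats
    Output j is the maximum over the k pieces of
    dot(weights[p][j], x) + biases[p][j].
    """
    d_out = len(biases[0])
    out = []
    for j in range(d_out):
        candidates = [
            sum(w * xi for w, xi in zip(W[j], x)) + b[j]
            for W, b in zip(weights, biases)
        ]
        out.append(max(candidates))
    return out


# Toy example: two pieces (+identity and -identity) with zero bias
# make each output the absolute value of the matching input.
identity = [[1.0, 0.0], [0.0, 1.0]]
negated = [[-1.0, 0.0], [0.0, -1.0]]
zero_bias = [0.0, 0.0]
result = maxout_dense([1.0, -2.0], [identity, negated], [zero_bias, zero_bias])
print(result)
```

The toy weights show why Maxout can act as a learned activation function: with two pieces it can represent the absolute value, and with other weights it can represent ReLU-like shapes.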

Chinese Abstract
ABSTRACT
CONTENTS
LIST OF FIGURES
LIST OF TABLES
Chapter 1.  Introduction
Chapter 2.  Related Works
2.1  Overview of CU Partition
2.2  ETH-CNN and CPH-Intra Database
2.2.1  CPH-Intra Database
2.2.2  ETH-CNN
2.3  Convolutional Neural Network
2.3.1  Network in Network
2.3.2  Maxout Network
2.3.3  Batch Normalization
Chapter 3.  Proposed Methods
3.1  Augmenting ETH-CNN
3.2  Xavier Initialization
3.3  Network in Network Implementation
3.4  Reduce Parameters in Concatenating Layer
3.5  Maxout in Fully Connected Layer
Chapter 4.  Simulation and Experimental Results
4.1  Configuration and Parameters Settings
4.2  Maxout: in different Layer
4.3  Network in Network with Maxout
4.4  Results and Discussions
Chapter 5.  Conclusion
REFERENCES


                                

[1]. Mai Xu, Tianyi Li, Zulin Wang, Xin Deng, Ren Yang and Zhenyu Guan, “Reducing Complexity of HEVC: A Deep Learning Approach,” IEEE Transactions on Image Processing, Vol. 27, No. 10, pp. 5044-5059, Oct. 2018.
[2]. Min Lin, Qiang Chen, Shuicheng Yan, “Network in Network,” arXiv: 1312.4400, Mar 2014.
[3]. Ian J. Goodfellow, David Warde-Farley, Mehdi Mirza, Aaron Courville, and Yoshua Bengio, “Maxout Networks,” Proceedings of the 30th International Conference on Machine Learning - Volume 28, Atlanta, USA, June 16-21, 2013.
[4]. Sergey Ioffe and Christian Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift,” arXiv: 1502.03167, Mar 2015.
[5]. Tianyi Li, Mai Xu and Xin Deng, “A Deep Convolutional Neural Network Approach for Complexity Reduction On Intra-Mode HEVC,” in 2017 IEEE International Conference on Multimedia and Expo (ICME), Hong Kong, 10-14 July 2017, pp. 1255-1260.
[6]. X. Glorot and Y. Bengio, “Understanding the difficulty of training deep feedforward neural networks,” in Proc. AISTATS, 2010, pp. 249–256.
[7]. L. Zhu, Y. Zhang, Z. Pan, R. Wang, S. Kwong and Z. Peng, “Binary and multi-class learning based low complexity optimization for HEVC encoding,” IEEE Transactions on Broadcasting, pp. 1–15, Jun. 2017.
[8]. A. Mohamed, G. Hinton, and G. Penn, “Understanding how deep belief networks perform acoustic modelling,” in Proc. ICASSP, 2012, pp. 4273–4276.
[9]. Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich, “Going deeper with convolutions,” The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 1-9.
[10]. JCT-VC, “HM Software,” [Online]. Available: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-16.5/, 2014, [Accessed 5-Nov.-2016].
[11]. F. Bossen, “Common test conditions and software reference configurations,” Joint Collaborative Team on Video Coding, document Rec. JCTVC-L1100, Jan. 2013.
[12]. J.-R. Ohm, G. J. Sullivan, H. Schwarz, T. K. Tan, and T. Wiegand, “Comparison of the coding efficiency of video coding standards including high efficiency video coding (HEVC),” IEEE Transactions on Circuits and Systems for Video Technology, Vol. 22, No. 12, pp. 1669–1684, Dec. 2012.
[13]. G. Bjøntegaard, “Calculation of average PSNR difference between RD-curves,” in ITU-T, VCEG-M33, Austin, TX, USA, Apr. 2001.
[14]. D. Liu, X. Liu and Y. Li, “Fast CU size decisions for HEVC intra frame coding based on support vector machines,” in 2016 IEEE 14th Intl Conf on Dependable, Autonomic and Secure Computing (DASC), 2016.
