Temporal Transcoding from H.264/AVC to SVC with Hierarchical Bidirectional Prediction

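Author:	Huang, Yen-Chieh (黃彥傑)
Thesis title:	Temporal Transcoding from H.264/AVC to SVC with Hierarchical Bidirectional Prediction (從H.264/AVC到具雙向階層式預測之SVC的時間視訊轉碼)
Advisors:	Lin, Chia-Wen (林嘉文); Chen, Yung-Chang (陳永昌)
Oral defense committee:	Yeh, Chia-Hung (葉家宏); Lin, Chia-Wen (林嘉文); Chen, Yung-Chang (陳永昌)
Degree:	Master
Department:	College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of publication:	2011
Academic year of graduation:	99 (ROC calendar)
Language:	English
Number of pages:	55
Keywords (Chinese):	視訊轉碼 (video transcoding), 階層式雙向預測 (hierarchical bidirectional prediction), 時間轉碼 (temporal transcoding)
Keywords (English):	Video Transcoding, H.264/AVC, H.264/SVC, Hierarchical Bidirectional Prediction, Temporal Transcoding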

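H.264/SVC, the scalable video coding standard, is an extension of H.264/AVC. Because it uses a notion of layers when encoding the bitstream, it can provide temporal, spatial, and quality scalability, separately or combined. This scalability allows a stream to be adapted as needed to support a variety of terminal devices and network channels. However, scalability must be provided at the encoder side by exploiting the dependencies between layers. Existing H.264/AVC content, which is now widespread, was encoded without this property and therefore cannot offer scalability. Most previous research has addressed video transcoding between MPEG-2 and H.264/AVC, and comparatively little has addressed transcoding from H.264/AVC to SVC. For broadcasters and video content providers, however, scalability saves considerable resources, so they want their encoded video streams to have it.
In this thesis, we propose temporal video transcoding from H.264/AVC to H.264/SVC with bidirectional prediction. The transcoder first decodes the input H.264/AVC bitstream, which yields the macroblock coding modes and motion vectors; suitable algorithms then adjust this information to speed up SVC encoding. We propose a macroblock mode decision algorithm that reduces the number of candidate macroblock modes and thereby accelerates encoding. We also propose a macroblock motion vector algorithm that makes full use of the information in the input H.264/AVC bitstream to accelerate motion vector estimation. Experimental results confirm that the method greatly reduces encoding complexity while maintaining close-to-ideal video quality.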

The scalable extension (SVC) of H.264/AVC uses a notion of layers within the encoded bitstream to provide temporal, spatial, and quality scalability, separately or combined. This scalability allows a stream to be adapted to different devices and heterogeneous networks. The SVC design requires scalability to be provided at the encoder side by exploiting inter-layer dependencies during encoding. Existing H.264/AVC content therefore cannot benefit from the scalability tools in SVC, because no scalability was built into the bitstream at encoding time. Since a great deal of technical and financial effort is currently being spent on the migration from MPEG-2 equipment to H.264/AVC, a new migration to SVC is unlikely in the short term. Because broadcasters and content distributors want scalable bitstreams at their disposal, efficient techniques for migrating single-layer content to a scalable format are desirable.
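To make the idea of temporal scalability concrete, the following minimal sketch assigns temporal layers in a dyadic hierarchical-B GOP and extracts a lower frame-rate sub-stream by dropping the higher layers. It is an illustration only and is not taken from the thesis: the function names `temporal_layer` and `extract_substream` are hypothetical, frames are identified only by their display index, and the GOP size is assumed to be a power of two.

```python
from typing import List


def temporal_layer(frame_index: int, gop_size: int) -> int:
    """Temporal layer of a frame in a dyadic hierarchical-B GOP.

    Assumption: key pictures sit at multiples of gop_size (layer 0) and
    every further layer halves the distance between pictures.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative")
    pos = frame_index % gop_size
    if pos == 0:
        return 0  # key picture
    num_levels = gop_size.bit_length() - 1        # log2(gop_size)
    trailing_zeros = (pos & -pos).bit_length() - 1
    return num_levels - trailing_zeros


def extract_substream(num_frames: int, gop_size: int, max_layer: int) -> List[int]:
    """Display indices that remain after dropping layers above max_layer."""
    return [i for i in range(num_frames)
            if temporal_layer(i, gop_size) <= max_layer]


if __name__ == "__main__":
    print([temporal_layer(i, 4) for i in range(9)])
    print(extract_substream(9, 4, 1))
```

With a GOP size of 4, frames 0 to 8 receive layers 0, 2, 1, 2, 0, 2, 1, 2, 0, and keeping layers 0 and 1 leaves frames 0, 2, 4, 6, 8, which is half the original frame rate.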
In this thesis, an approach for temporal transcoding from H.264/AVC to SVC with hierarchical bidirectional prediction is presented. The transcoder fully decodes the input H.264/AVC bitstream. Macroblock coding modes and motion vectors are extracted from the input and adjusted for encoding the output bitstream. A mode decision algorithm is proposed to reduce the number of candidate coding modes, and a motion vector decision algorithm is proposed to derive the output motion vectors from the input motion vectors. As a result, computational complexity decreases significantly while compression efficiency remains close to optimal.
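The abstract does not spell out how output motion vectors are derived from input ones, so the sketch below is only a generic illustration of the reuse idea and should not be read as the thesis's algorithm. It uses naive vector summation under strong assumptions that are not stated in the source: the input is an IPPP stream in which every frame references the immediately preceding frame, each block carries a single motion vector, and the block is treated as staying co-located across frames. The names `compose_forward_mv` and `mirror_backward_mv` are hypothetical.

```python
from typing import List, Tuple

MV = Tuple[int, int]  # (dx, dy) in the same units as the input vectors


def compose_forward_mv(per_frame_mvs: List[MV], src: int, ref: int) -> MV:
    """Approximate the vector from frame `src` back to an earlier frame `ref`.

    per_frame_mvs[k] is assumed to be the vector of the co-located block
    in frame k pointing to frame k - 1 (entry 0 is unused).
    """
    if not 0 <= ref < src < len(per_frame_mvs):
        raise ValueError("need 0 <= ref < src < number of frames")
    dx = sum(per_frame_mvs[k][0] for k in range(ref + 1, src + 1))
    dy = sum(per_frame_mvs[k][1] for k in range(ref + 1, src + 1))
    return (dx, dy)


def mirror_backward_mv(per_frame_mvs: List[MV], cur: int, future_ref: int) -> MV:
    """Backward candidate from `cur` to a later frame `future_ref`.

    Assumption: motion is reversible, so the candidate is the negated
    forward vector that leads from `future_ref` back to `cur`.
    """
    fx, fy = compose_forward_mv(per_frame_mvs, future_ref, cur)
    return (-fx, -fy)


if __name__ == "__main__":
    mvs = [(0, 0), (2, -1), (3, 0), (1, 1), (2, 2)]
    print(compose_forward_mv(mvs, 4, 0))   # frame 4 predicted from frame 0
    print(mirror_backward_mv(mvs, 2, 4))   # frame 2 predicted from frame 4
```

A vector obtained this way would normally serve only as the starting point of a small refinement search, since summation ignores that the referenced area drifts away from the co-located block.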

Table of Contents
Abstract
Table of Contents
List of Figures
List of Tables

Chapter 1: Introduction
1.1 Overview of Scalable Video Coding
1.2 Motivation
1.3 Thesis Organization

Chapter 2: Related Works
2.1 Quality-SNR Scalability
2.2 Spatial Scalability
2.3 Temporal Scalability
2.3.1 Temporal Scalability in H.264/SVC
2.3.2 Different Approaches and Architectures of Video Transcoding from AVC-to-SVC with Temporal Scalability

Chapter 3: Temporal Transcoding from H.264/AVC to SVC with Hierarchical Bidirectional Prediction
3.1 Input and Output GOP Structures
3.2 H.264/SVC Encoder Analysis
3.3 Mode Decision
3.3.1 Mode Decision of Top Layer of Hierarchical B Structure
3.3.2 Mode Decision of Non-top Layer of Hierarchical B Structure
3.4 Motion Vector Decision
3.4.1 Forward Motion Vectors Decision
3.4.2 Backward Motion Vectors Decision
3.4.3 Variable Block Size of H.264/AVC
3.5 Overall Architecture of the Proposed Transcoder

Chapter 4: Simulation Results
4.1 Software Simulation Environment
4.2 The Results of Simulation

Chapter 5: Conclusions
5.1 Conclusions
References

List of Figures

Fig 1.1: Video adaptation operation
Fig 1.2: Hierarchical prediction structures for enabling temporal scalability. (a) Hierarchical B-picture structure. (b) Nondyadic hierarchical structure. (c) Zero-delay structure
Fig 1.3: (a) H.264/AVC encoder structure. (b) H.264/SVC encoder structure.
Fig 1.4: Decoding and reencoding
Fig 1.5: Information reuse
Fig 2.1: Hierarchical prediction structures for enabling temporal scalability. (a) Hierarchical B-picture structure. (b) Nondyadic hierarchical structure. (c) Zero-delay structure
Fig 2.2: Hierarchical B prediction structure with three temporal layers (TL)
Fig 2.3: AVC to SVC transcoder architecture in [9]
Fig 2.4: AVC to SVC simplified transcoder architecture in [9]
Fig 2.5: AVC to SVC transcoder architecture in [10]
Fig 2.6: Input AVC bitstream GOP structure and output SVC bitstream GOP structure in [10]
Fig 2.7: The algorithm of motion vector mapping in [10]
Fig 2.8: The proposed transcoder in [11]
Fig 2.9: Reduced search range in [11]
Fig 2.10: MB partition of H.264/AVC (left) and H.264/SVC (right) in [11]
Fig 2.11: MB in H.264/AVC with its MVs and the matching MB in SVC with its corresponding MVs in [11]
Fig 3.1: An example of the input AVC bitstream structure and the output SVC bitstream structure for GOP = 4
Fig 3.2: Motion estimation process flow
Fig 3.3: Top layers of the hierarchical B structure (marked by circles) and non-top layers (marked by rectangles)
Fig 3.4: Example of mode decision
Fig 3.5: Statistics of the mode decisions in the AVC encoder and the corresponding percentages of modes in the SVC encoder: top layer case
Fig 3.6: Flow chart of the mode decision for the top layer of the hierarchical B structure
Fig 3.7: Statistics of the mode decisions in the AVC encoder and the corresponding percentages of modes in the SVC encoder: non-top layer case
Fig 3.8: Flow chart of the mode decision for the non-top layers of the hierarchical B structure
Fig 3.9: Motion vector decision based on the input AVC motion vectors
Fig 3.10: Motion vector composition
Fig 3.11: Motion vector composition by the FDVS algorithm
Fig 3.12: The cases of forward prediction
Fig 3.13: Backward motion vector based on the forward prediction
Fig 3.14: An example of the general BMPA
Fig 3.15: An example of multiple motion vectors in a macroblock
Fig 3.16: The method of Partition Merge
Fig 3.17: Flow chart of the proposed transcoder
Fig 4.1: RD curve of the Soccer sequence in CIF resolution with GOP = 16
Fig 4.2: RD curve of the Stefan sequence in CIF resolution with GOP = 16
Fig 4.3: RD curve of the Football sequence in CIF resolution with GOP = 16
Fig 4.4: RD curve of the Hall sequence in CIF resolution with GOP = 16
Fig 4.5: RD curve of the Harbour sequence in CIF resolution with GOP = 16
Fig 4.6: RD curve of the Foreman sequence in CIF resolution with GOP = 16
Fig 4.7: RD curve of the Crew sequence in CIF resolution with GOP = 16
Fig 4.8: RD curve of the Soccer sequence in CIF resolution with GOP = 8
Fig 4.9: RD curve of the Stefan sequence in CIF resolution with GOP = 8
Fig 4.10: RD curve of the Football sequence in CIF resolution with GOP = 8
Fig 4.11: RD curve of the Hall sequence in CIF resolution with GOP = 8
Fig 4.12: RD curve of the Harbour sequence in CIF resolution with GOP = 8
Fig 4.13: RD curve of the Foreman sequence in CIF resolution with GOP = 8
Fig 4.14: RD curve of the Crew sequence in CIF resolution with GOP = 8
Fig 4.15: RD curve of the Crew sequence in QCIF resolution with GOP = 8
Fig 4.16: RD curve of the Harbour sequence in QCIF resolution with GOP = 8
Fig 4.17: RD curve of the Soccer sequence in QCIF resolution with GOP = 8
Fig 4.18: RD curve of the Crew sequence in 4CIF resolution with GOP = 16
Fig 4.19: RD curve of the Harbour sequence in 4CIF resolution with GOP = 16
Fig 4.20: RD curve of the Soccer sequence in 4CIF resolution with GOP = 16

List of Tables

Table 3.1: The statistics of the mode decision (top layer)
Table 3.2: The statistics of the mode decision (non-top layer)
Table 4.1: The proposed method compared to the CPDT with exhaustive motion estimation (GOP = 16 and CIF resolution)
Table 4.2: The proposed method compared to the CPDT with fast motion estimation (GOP = 16 and CIF resolution)
Table 4.3: The proposed method compared to the CPDT with exhaustive motion estimation (GOP = 8 and CIF resolution)
Table 4.4: The proposed method compared to the CPDT with fast motion estimation (GOP = 8 and CIF resolution)
Table 4.5: The proposed method compared to the CPDT with exhaustive motion estimation (GOP = 8 and QCIF resolution)
Table 4.6: The proposed method compared to the CPDT with fast motion estimation (GOP = 8 and QCIF resolution)
Table 4.7: The proposed method compared to the CPDT with exhaustive motion estimation (GOP = 16 and 4CIF resolution)
Table 4.8: The proposed method compared to the CPDT with fast motion estimation (GOP = 16 and 4CIF resolution)

References

[1] ITU-T and ISO/IEC JTC 1, Advanced Video Coding for Generic Audiovisual Services, ITU-T Rec. H.264/AVC and ISO/IEC 14496-10 (including SVC extension), March 2009.
[2] S.-F. Chang and A. Vetro, "Video Adaptation: Concept, Technologies, and Open Issues," Proceedings of the IEEE, vol. 93, no. 1, pp. 148-158, Jan. 2005.
[3] H. Schwarz, D. Marpe, and T. Wiegand, "Overview of the scalable video coding extension of the H.264/AVC Standard," IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 9, pp. 1103-1120, Sep. 2007.
[4] J. Xin, C.-W. Lin, and M.-T. Sun, "Digital video transcoding," Proceedings of the IEEE, vol. 93, no. 1, pp. 84-97, 2005.
[5] H. Shen, S. Xiaoyan, F. Wu, H. Li, and S. Li, "Transcoding to FGS Streams from H.264/AVC Hierarchical B-Pictures," IEEE Int. Conf. Image Processing, Atlanta, 2006.
[6] J. De Cock, S. Notebaert, P. Lambert, and R. Van de Walle, "Architectures of Fast Transcoding of H.264/AVC to Quality-Scalable SVC Streams," IEEE Transactions on Multimedia, vol. 11, no. 7, pp. 1209-1224, 2009.
[7] R. Sachdeva, S. Johar, and E. Piccinelli, "Adding SVC Spatial Scalability to Existing H.264/AVC Video," 8th IEEE/ACIS International Conference on Computer and Information Science, Shanghai, 2009.
[8] Joint Video Team, JSVM reference software, Version 9.17.
[9] A. Dziri, A. Diallo, M. Kieffer, and P. Duhamel, "P-Picture Based H.264 AVC to H.264 SVC Temporal Transcoding," International Wireless Communications and Mobile Computing Conference, 2008.
[10] H. Al-Muscati and F. Labeau, "Temporal Transcoding of H.264/AVC Video to the scalable format," Image Processing Theory Tools and Applications (IPTA), 2010.
[11] R. Garrido-Cantos, J. De Cock, J. L. Martinez, S. Van Leuven, and P. Cuenca, "Motion-Based Temporal Transcoding from H.264/AVC-to-SVC in Baseline Profile," IEEE Transactions on Consumer Electronics, Feb. 2011.
[12] J. Youn, M.-T. Sun, and C.-W. Lin, "Motion Vector Refinement for Transcoding," IEEE Transactions on Multimedia, vol. 1, no. 1, pp. 30-40, March 1999.
[13] Joint Model (JM) reference software, Version 17.1.
[14] B. Shen, I. K. Sethi, and B. Vasudev, "Adaptive motion-vector re-sampling for compressed video downscaling," IEEE Trans. Circuits Syst. Video Technol., vol. 9, no. 6, pp. 929-936, Sep. 1999.

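Full-text release date: the full text is not authorized for public release (campus network)
Full-text release date: the full text is not authorized for public release (off-campus network)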
