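Graduate student: Huang, Yen-Chieh (黃彥傑)
Thesis title: Temporal Transcoding from H.264/AVC to SVC with Hierarchical Bidirectional Prediction
Thesis title (Chinese): 從H.264/AVC到具雙向階層式預測之SVC的時間視訊轉碼
Advisors: Lin, Chia-Wen (林嘉文)
Chen, Yung-Chang (陳永昌)
Oral defense committee: Yeh, Chia-Hung (葉家宏)
Lin, Chia-Wen (林嘉文)
Chen, Yung-Chang (陳永昌)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2011
Academic year of graduation: 99 (ROC calendar)
Language: English
Number of pages: 55
Keywords (translated from Chinese): video transcoding, hierarchical bidirectional prediction, temporal transcoding
Keywords (English): Video Transcoding, H.264/AVC, H.264/SVC, Hierarchical Bidirectional Prediction, Temporal Transcoding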
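  • Scalable Video Coding (H.264/SVC) is a standard extended from H.264/AVC. Because it uses the notion of layers when encoding the bitstream, it can provide temporal, spatial, and quality scalability, separately or combined. This scalability lets a stream be adapted as needed to support a wide range of terminal devices and network channels. However, SVC can offer this scalability only if the encoder exploits the dependencies between layers. The H.264/AVC bitstreams that are widespread today inherently lack this property at the encoder and therefore cannot provide scalability. Most past research has addressed video transcoding between MPEG-2 and H.264/AVC, and comparatively little has addressed transcoding from H.264/AVC to SVC. For broadcasters and video content providers, however, scalability helps considerably in saving resources, so they want their encoded video streams to have it.
      In this thesis, we propose temporal video transcoding from H.264/AVC to H.264/SVC with bidirectional prediction. The transcoder first decodes the input H.264/AVC bitstream. Decoding the input yields the macroblock coding modes and the macroblock motion vectors, and adjusting this information with suitable algorithms speeds up SVC encoding. We propose a macroblock mode decision algorithm that reduces the number of candidate macroblock modes and thereby accelerates encoding. We also propose a macroblock motion vector algorithm that makes full use of the information in the input H.264/AVC bitstream to accelerate motion vector estimation. Experimental results confirm that the approach greatly reduces encoding complexity while retaining close-to-ideal video quality.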


    The scalable extension (SVC) of H.264/AVC uses a notion of layers within the encoded bitstream to provide temporal, spatial, and quality scalability, separately or combined. This scalability allows a stream to be adapted to scenarios with different devices and heterogeneous networks. The SVC design requires scalability to be provided at the encoder side by exploiting inter-layer dependencies during encoding. This implies that existing H.264/AVC content cannot benefit from the scalability tools in SVC, because no intrinsic scalability was built into the bitstream at encoding time. Since a lot of technical and financial effort is currently being spent on the migration from MPEG-2 equipment to H.264/AVC, it is unlikely that a new migration to SVC will occur in the short term. Because broadcasters and content distributors want to have scalable bitstreams at their disposal, efficient techniques for migrating single-layer content to a scalable format are desirable.
    In this thesis, an approach for temporal transcoding from H.264/AVC to SVC with hierarchical bidirectional prediction is discussed. The input H.264/AVC bitstream is fully decoded by the transcoder. Macroblock coding modes and motion vectors are extracted from the input and adjusted to encode the output bitstream. A mode decision algorithm is proposed to reduce the number of candidate coding modes, and a motion vector decision algorithm is proposed to derive the output motion vectors from the input motion vectors. As a result, a significant decrease in computational complexity is achieved while maintaining close-to-optimal compression efficiency.
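
    As a rough illustration of the temporal scalability that hierarchical bidirectional prediction enables, the sketch below assigns temporal layers to pictures in a dyadic hierarchical-B GOP and shows which pictures remain when the upper layers are dropped. It is not the thesis's transcoding algorithm. The function names `temporal_layer` and `extract_sub_stream` are hypothetical, and the code assumes a dyadic structure (GOP size a power of two) with pictures indexed in display order.

```python
def temporal_layer(poc: int, gop_size: int) -> int:
    """Temporal layer of picture `poc` (display order) in a dyadic hierarchical-B GOP.

    Assumption for illustration: gop_size is a power of two and key
    pictures sit at multiples of gop_size.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    depth = gop_size.bit_length() - 1          # log2(gop_size)
    pos = poc % gop_size
    if pos == 0:
        return 0                               # key picture: base temporal layer
    trailing_zeros = (pos & -pos).bit_length() - 1
    return depth - trailing_zeros              # odd positions land in the top layer


def extract_sub_stream(num_frames: int, gop_size: int, max_layer: int) -> list[int]:
    """Pictures kept when every temporal layer above max_layer is discarded."""
    return [poc for poc in range(num_frames)
            if temporal_layer(poc, gop_size) <= max_layer]


if __name__ == "__main__":
    print([temporal_layer(p, 4) for p in range(9)])
    print(extract_sub_stream(9, 4, max_layer=1))
```

    With GOP = 4 this gives three temporal layers, and dropping the top layer halves the frame rate while the remaining pictures stay decodable, which is the property a single-layer H.264/AVC stream encoded without such a structure does not offer.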

    Table of Contents
    Abstract
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1: Introduction
        1.1 Overview of Scalable Video Coding
        1.2 Motivation
        1.3 Thesis organization
    Chapter 2: Related Works
        2.1 Quality-SNR Scalability
        2.2 Spatial Scalability
        2.3 Temporal Scalability
            2.3.1 Temporal Scalability in H.264/SVC
            2.3.2 Different Approaches and Architectures of Video Transcoding from AVC-to-SVC with Temporal Scalability
    Chapter 3: Temporal Transcoding from H.264/AVC to SVC with Hierarchical Bidirectional Prediction
        3.1 Input and Output GOP Structures
        3.2 H.264/SVC Encoder Analysis
        3.3 Mode Decision
            3.3.1 Mode Decision of Top Layer of Hierarchical B Structure
            3.3.2 Mode Decision of Non-top Layer of Hierarchical B Structure
        3.4 Motion Vector Decision
            3.4.1 Forward Motion Vectors Decision
            3.4.2 Backward Motion Vectors Decision
            3.4.3 Variable Block Size of H.264/AVC
        3.5 Overall Architecture of the Proposed Transcoder
    Chapter 4: Simulation Results
        4.1 Software Simulation Environment
        4.2 The Results of Simulation
    Chapter 5: Conclusions
        5.1 Conclusions
    References

    List of Figures
    Fig 1.1: Video adaptation operation
    Fig 1.2: Hierarchical prediction structures for enabling temporal scalability. (a) Hierarchical B-picture structure. (b) Nondyadic hierarchical structure. (c) Zero-delay structure
    Fig 1.3: (a) H.264/AVC encoder structure. (b) H.264/SVC encoder structure.
    Fig 1.4: Decoding and reencoding
    Fig 1.5: Information reuse
    Fig 2.1: Hierarchical prediction structures for enabling temporal scalability. (a) Hierarchical B-picture structure. (b) Nondyadic hierarchical structure. (c) Zero-delay structure
    Fig 2.2: Hierarchical B prediction structure with three temporal layers (TL)
    Fig 2.3: AVC to SVC Transcoder architecture in [9]
    Fig 2.4: AVC to SVC simplified transcoder architecture in [9]
    Fig 2.5: AVC to SVC Transcoder architecture in [10]
    Fig 2.6: Input AVC bitstream GOP structure and output SVC bitstream GOP structure in [10]
    Fig 2.7: The algorithm of motion vectors mapping in [10]
    Fig 2.8: The proposed transcoder in [11]
    Fig 2.9: Reduced search range in [11]
    Fig 2.10: MB partition of H.264/AVC (left) and H.264/SVC (right) in [11]
    Fig 2.11: MB in H.264/AVC with its MVs and the matching MB in SVC with its corresponding MVs in [11]
    Fig 3.1: An example of the input AVC bitstream structure and the output SVC bitstream structure of GOP = 4
    Fig 3.2: Motion estimation process flow
    Fig 3.3: Top layers of hierarchical B structure which are marked by a circle and non-top layers of hierarchical B structure which are marked by a rectangular
    Fig 3.4: Example of mode decision
    Fig 3.5: The statistics of the mode decision in AVC encoder corresponds to the percentages of modes of SVC encoder: Top layer case
    Fig 3.6: The flow chart of mode decision of top layer of hierarchical B structure
    Fig 3.7: The statistics of the mode decision in AVC encoder corresponds to the percentages of modes of SVC encoder: Non-top layer case
    Fig 3.8: The flow chart of mode decision of non-top layer of hierarchical B structure
    Fig 3.9: Motion Vector Decision based on the input AVC motion vectors
    Fig 3.10: The Motion vector composition
    Fig 3.11: The motion vector composition by the algorithm of FDVS
    Fig 3.12: The cases of forward prediction
    Fig 3.13: Backward motion vector based on the forward prediction
    Fig 3.14: The example of the general BMPA
    Fig 3.15: The example of multiple motion vectors in a macroblock
    Fig 3.16: The method of Partition Merge
    Fig 3.17: The flow chart of the proposed transcoder
    Fig 4.1: RD curve of the Soccer sequence in CIF resolution with GOP = 16
    Fig 4.2: RD curve of the Stefan sequence in CIF resolution with GOP = 16
    Fig 4.3: RD curve of the Football sequence in CIF resolution with GOP = 16
    Fig 4.4: RD curve of the Hall sequence in CIF resolution with GOP = 16
    Fig 4.5: RD curve of the Hall sequence in CIF resolution with GOP = 16
    Fig 4.6: RD curve of the Foreman sequence in CIF resolution with GOP = 16
    Fig 4.7: RD curve of the Crew sequence in CIF resolution with GOP = 16
    Fig 4.8: RD curve of the Soccer sequence in CIF resolution with GOP = 8
    Fig 4.9: RD curve of the Stefan sequence in CIF resolution with GOP = 8
    Fig 4.10: RD curve of the Football sequence in CIF resolution with GOP = 8
    Fig 4.11: RD curve of the Hall sequence in CIF resolution with GOP = 8
    Fig 4.12: RD curve of the Harbour sequence in CIF resolution with GOP = 8
    Fig 4.13: RD curve of the Foreman sequence in CIF resolution with GOP = 8
    Fig 4.14: RD curve of the Crew sequence in CIF resolution with GOP = 8
    Fig 4.15: RD curve of the Crew sequence in QCIF resolution with GOP = 8
    Fig 4.16: RD curve of the Harbour sequence in QCIF resolution with GOP = 8
    Fig 4.17: RD curve of the Soccer sequence in QCIF resolution with GOP = 8
    Fig 4.18: RD curve of the Crew sequence in 4CIF resolution with GOP = 16
    Fig 4.19: RD curve of the Harbour sequence in 4CIF resolution with GOP = 16
    Fig 4.20: RD curve of the Soccer sequence in 4CIF resolution with GOP = 16

    List of Tables
    Table 3.1: The statistics of the mode decision (top layer)
    Table 3.2: The statistics of the mode decision (non-top layer)
    Table 4.1: The proposed method compared to the CPDT with exhaustive motion estimation (GOP = 16 and CIF resolution)
    Table 4.2: The proposed method compared to the CPDT with fast motion estimation (GOP = 16 and CIF resolution)
    Table 4.3: The proposed method compared to the CPDT with exhaustive motion estimation (GOP = 8 and CIF resolution)
    Table 4.4: The proposed method compared to the CPDT with fast motion estimation (GOP = 8 and CIF resolution)
    Table 4.5: The proposed method compared to the CPDT with exhaustive motion estimation (GOP = 8 and QCIF resolution)
    Table 4.6: The proposed method compared to the CPDT with fast motion estimation (GOP = 8 and QCIF resolution)
    Table 4.7: The proposed method compared to the CPDT with exhaustive motion estimation (GOP = 16 and 4CIF resolution)
    Table 4.8: The proposed method compared to the CPDT with fast motion estimation (GOP = 16 and 4CIF resolution)

    [1] ITU-T and ISO/IEC JTC 1, “Advanced Video Coding for Generic Audiovisual Services,” ITU-T Rec. H.264/AVC and ISO/IEC 14496-10 (including SVC extension), March 2009.
    [2] S.-F. Chang and A. Vetro, “Video Adaptation: Concept, Technologies, and Open Issues,” Proceedings of the IEEE, vol. 93, no. 1, pp. 148-158, Jan. 2005.
    [3] H. Schwarz, D. Marpe, and T. Wiegand, “Overview of the scalable video coding extension of the H.264/AVC Standard,” IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 9, pp. 1103-1120, Sep. 2007.
    [4] J. Xin, C.-W. Lin, and M.-T. Sun, “Digital video transcoding,” Proceedings of the IEEE, vol. 93, no. 1, pp. 84-97, 2005.
    [5] H. Shen, S. Xiaoyan, F. Wu, H. Li, and S. Li, “Transcoding to FGS Streams from H.264/AVC Hierarchical B-Pictures,” IEEE Int. Conf. Image Processing, Atlanta, 2006.
    [6] J. De Cock, S. Notebaert, P. Lambert, and R. Van de Walle, “Architectures for Fast Transcoding of H.264/AVC to Quality-Scalable SVC Streams,” IEEE Transactions on Multimedia, vol. 11, no. 7, pp. 1209-1224, 2009.
    [7] R. Sachdeva, S. Johar, and E. Piccinelli, “Adding SVC Spatial Scalability to Existing H.264/AVC Video,” 8th IEEE/ACIS International Conference on Computer and Information Science, Shanghai, 2009.
    [8] Joint Video Team, JSVM reference software, version 9.17.
    [9] A. Dziri, A. Diallo, M. Kieffer, and P. Duhamel, “P-Picture Based H.264 AVC to H.264 SVC Temporal Transcoding,” International Wireless Communications and Mobile Computing Conference, 2008.
    [10] H. Al-Muscati and F. Labeau, “Temporal Transcoding of H.264/AVC Video to the scalable format,” Image Processing Theory, Tools and Applications (IPTA), 2010.
    [11] R. Garrido-Cantos, J. De Cock, J. L. Martinez, S. Van Leuven, and P. Cuenca, “Motion-Based Temporal Transcoding from H.264/AVC-to-SVC in Baseline Profile,” IEEE Transactions on Consumer Electronics, Feb. 2011.
    [12] J. Youn, M.-T. Sun, and C.-W. Lin, “Motion Vector Refinement for Transcoding,” IEEE Transactions on Multimedia, vol. 1, no. 1, pp. 30-40, March 1999.
    [13] Joint Model, JM reference software, version 17.1.
    [14] B. Shen, I. K. Ishwar, and V. Bhaskaran, “Adaptive motion-vector re-sampling for compressed video downscaling,” IEEE Trans. Circuits Syst. Video Technol., vol. 9, no. 6, pp. 929-936, Sep. 1999.

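    Full-text release date: full text not authorized for public release (on-campus network)
    Full-text release date: full text not authorized for public release (off-campus network)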
