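Graduate student: | 陳柏村 Chen, Bo-Cun |
---|---|
Thesis title: | 可調性編碼中依據影像內容導向之空間可調性方法 Content-Aware Spatial Scalability for Scalable Video Coding |
Advisor: | 林嘉文 Lin, Chia-Wen |
Oral defense committee: | 林銀議 Lin, Yin-Yi; 簡韶逸 Chien, Shao-Yi; 彭文孝 Peng, Wen-Hsiao; 林嘉文 Lin, Chia-Wen |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2012 |
Academic year of graduation: | 100 |
Language: | English |
Number of pages: | 52 |
Keywords (Chinese): | 空間可調性 (spatial scalability), 視訊畫面調整 (video retargeting), 視訊畫面濃縮 (video condensation), 層間預測 (inter-layer prediction) |
Keywords (English): | H.264/SVC, Spatial Scalability, Inter-layer Prediction |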
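As technology advances, more and more devices can play multimedia video content. To meet the needs of different devices for playing audio and video streams, MPEG and VCEG defined a scalable video coding standard based on H.264/AVC. Its goal is for the encoded bitstream to provide temporal, spatial, and quality scalability, separately or in combination, so that users can extract a suitable sub-bitstream for network streaming according to the channel conditions and the capability of the device. However, the spatial scalability of H.264/SVC supports only cropping or uniform scaling to produce lower-resolution versions of the video, which may lose picture information, deform objects, or fail to preserve the size of important objects. We therefore combine video retargeting with the spatial scalability of H.264/SVC, so that the lower-resolution content retains the regions viewers are interested in and only shrinks or discards the regions of less interest, preserving the meaning that the original picture is intended to convey.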
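This thesis proposes a content-aware spatial scalability method for scalable video coding. We first use a mosaic-guided video retargeting technique to preserve the important content in the spatial base layer. In addition, we propose a low-overhead side-information coder and several non-homogeneous inter-layer predictors to reduce the bit-rate burden of coding the spatial enhancement layer. The experimental results confirm that our method not only retains better subjective quality in the lower-resolution video, but also increases the bit rate by only 4.17%-4.98% on average.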
The scalable extension of H.264/AVC (SVC) supports video cropping or uniform scaling to create lower-resolution versions of the video content. However, these operations can cause information loss, deform important objects, or fail to preserve the size of important objects at the lower resolutions. We therefore combine video retargeting with the spatial scalability of H.264/SVC, so that the lower-resolution content keeps the essential visual regions while condensing unimportant content.
In this thesis, we propose content-aware spatial scalability for scalable video coding. First, we use a mosaic-guided video retargeting method to preserve the important content in the spatial base layer. Moreover, we propose a low-overhead side-information coder and several non-homogeneous inter-layer prediction coding tools to mitigate the bit-rate overhead in the spatial enhancement layer. The experimental results demonstrate that the proposed method not only preserves the subjective quality of important content in the lower-resolution sequence, but also incurs an average bit-rate overhead of only 4.17%-4.98%.