
Graduate student: 陳佳育
Chen, Chia-Yu
Thesis title: Optimization of Memory Access for H.264/AVC Decoder on Embedded DSP Core
基於嵌入式數位訊號處理器之H.264/AVC解碼器記憶體存取最佳化
Advisor: 吳仁銘
Wu, Jen-Ming
Committee members:
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of publication: 2008
Academic year of graduation: 97 (ROC calendar)
Language: English
Number of pages: 64
Keywords (Chinese): H.264/AVC, 解碼器最佳化, 記憶體存取, 嵌入式數位訊號處理器
Keywords (English): H.264/AVC, decoder optimization, memory access, embedded DSP
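  • As technology advances, the demand for multimedia keeps growing. As the resolution of multimedia video increases, we need either more memory to store the video data or more efficient video coding techniques to compress it. The latest video coding standard, H.264/AVC, provides better compression efficiency and picture quality than earlier standards such as MPEG-2 and H.263. In return, the computational complexity of H.264/AVC is much higher than that of the earlier standards, so implementing a real-time H.264/AVC decoder in software requires a more powerful processor and more efficient algorithms. The main performance bottleneck of an H.264/AVC decoder is memory bandwidth: compared with the rapid growth of processor clock rates, memory access speed has not improved at a similar pace, and this limits the performance of the decoder.
    There are several ways to ease the memory bandwidth problem. The first is to raise the memory access speed. The second is to use a hierarchical memory architecture, in which a cache speeds up data access. The last is to improve the algorithm so that less data is transferred over the bus. This thesis proposes a new H.264/AVC decoding flow that, for embedded systems with cache memory, greatly reduces the number of data transfers on the external bus. Integrating the deblocking filter (DF) with the inverse discrete cosine transform (IDCT) and inverse quantization (IQ) greatly reduces the data traffic on the bus. Because of this integration, extra memory is needed to record the intra predictors; compared with the previous method, the proposed architecture saves about 44% of the intra predictors. Finally, we implement the proposed architecture on the Starfish SoC platform, a low-power, high-performance digital signal processor jointly designed by National Tsing Hua University and National Chiao Tung University. Experimental results show that, with our method, the amount of data transferred over the external bus during H.264/AVC decoding is reduced by 35.5%.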


    The H.264/AVC standard provides enhanced coding efficiency for a wide range of applications and achieves better compression efficiency than other existing video coding standards. However, the computational complexity of an H.264/AVC decoder is also higher, so a software-based real-time decoder requires a powerful processor and more efficient algorithms. The major performance bottleneck of a software-based H.264/AVC decoder is memory bus bandwidth: the H.264/AVC reference software spends too much time on memory access and data transfer, so the memory bandwidth problem must be addressed.
    There are three ways to deal with this performance bottleneck. The first is to increase the memory bandwidth. The second is to use a memory hierarchy to shorten the memory access time. The third is to reduce the number of data transfers on the external memory bus. This thesis proposes a method for a software-based H.264/AVC decoder that reduces the number of memory accesses, especially on cache-based DSP processors. The method incorporates the deblocking filter into the IDCT and IQ process, which removes unnecessary loads and stores to external memory. This decoding flow requires extra predictor memory for intra prediction, but it saves nearly 44% of the predictor memory compared with the former scheme. Furthermore, we implement the H.264/AVC baseline profile decoder on the Starfish DSP platform. The Starfish DSP is a low-power, high-performance embedded DSP core developed by National Tsing Hua University (NTHU) and National Chiao Tung University (NCTU). The experimental results show that the memory access cycles spent on data transfer are reduced by 35.5%.
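    The idea of merging the deblocking filter into the IDCT and IQ stage can be illustrated with a toy transfer count. The sketch below is not the thesis's implementation: the function names, the one-transfer-per-macroblock cost model, and the predictor buffer layout (one unfiltered luma row as wide as the frame plus one unfiltered luma column) are all assumptions made for illustration. The model also ignores the few neighbor pixels that edge filtering modifies, as well as chroma.

```python
from typing import Tuple

MB_SIZE = 16  # luma macroblock edge length in pixels


def two_pass_transfers(mb_cols: int, mb_rows: int) -> int:
    """Count macroblock transfers when deblocking runs as a separate frame pass."""
    transfers = 0
    # Pass 1: prediction + IQ/IDCT, then store each unfiltered macroblock.
    for _ in range(mb_cols * mb_rows):
        transfers += 1  # store reconstructed macroblock to external memory
    # Pass 2: reload every macroblock, filter it, and store it again.
    for _ in range(mb_cols * mb_rows):
        transfers += 2  # one load plus one store
    return transfers


def fused_transfers(mb_cols: int, mb_rows: int) -> Tuple[int, int]:
    """Count transfers and predictor bytes when filtering follows IQ/IDCT directly."""
    transfers = 0
    # Assumed on-chip predictor buffers: intra prediction needs unfiltered
    # neighbor pixels, so they are saved before the filter overwrites them.
    top_row = bytearray(mb_cols * MB_SIZE)  # unfiltered bottom rows of the MB row above
    left_col = bytearray(MB_SIZE)           # unfiltered right column of the left MB
    for _ in range(mb_rows):
        for col in range(mb_cols):
            # Placeholder pixel values stand in for the real unfiltered borders.
            top_row[col * MB_SIZE:(col + 1) * MB_SIZE] = bytes(MB_SIZE)
            left_col[:] = bytes(MB_SIZE)
            # The macroblock is reconstructed and filtered on chip, then stored once.
            transfers += 1
    return transfers, len(top_row) + len(left_col)
```

    For a QCIF frame (11 x 9 macroblocks) this model counts 297 transfers for the two-pass flow and 99 for the fused flow, with a 192-byte predictor buffer. These figures are properties of the toy model only and are not measurements from the thesis.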

    Abstract
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Overview
      1.2 Motivation and Contribution
      1.3 Organization
    Chapter 2 Overview of H.264/AVC Standard
      2.1 Overview of H.264/AVC Standard Basic
      2.2 H.264/AVC Profiles and Levels
        2.2.1 Profiles
        2.2.2 Levels
      2.3 H.264/AVC Video Codec
        2.3.1 H.264/AVC Encoder
        2.3.2 H.264/AVC Decoder
      2.4 H.264/AVC Video Coding Standard
        2.4.1 Intra Prediction
        2.4.2 Inter Prediction
        2.4.3 Deblocking Filter
        2.4.4 Transform and Quantization
        2.4.5 Entropy Coding
    Chapter 3 H.264/AVC Deblocking Filter
      3.1 Blocking Artifacts
      3.2 H.264/AVC Deblocking Filter
        3.2.1 Get Boundary Strength
        3.2.2 The Filtering Thresholds for each block edge
        3.2.3 Filtering process for each block edge
    Chapter 4 Proposed Architecture
      4.1 Overview of Related Research
      4.2 Latest Related Work: [Kuo, 2007] [12]
      4.3 Proposed Architecture
    Chapter 5 Implementation and Performance Evaluation
      5.1 The Introduction of Starfish SoC
        5.1.1 The Architecture of Starfish DSP core
        5.1.2 The software tool-chain of Starfish DSP core
        5.1.3 The Starfish DSP Library
        5.1.4 The Co-Verification Environment of Starfish DSP core
        5.1.5 The Starfish SoC platform architecture
      5.2 Experimental result
        5.2.1 The usage of predictor memory
        5.2.2 Implementation
        5.2.3 The Starfish core simulation
        5.2.4 The Starfish SoC platform RTL simulation
        5.2.5 The Starfish SoC platform’s performance
        5.2.6 The performance of Starfish DSP Library
    Chapter 6 Conclusions and Future Work
      6.1 Conclusions
      6.2 Future Work
    Bibliography

    [1] ITU-T Recommendation H.264 & ISO/IEC 14496-10, “Advanced Video Coding for Generic Audiovisual Services”, Version 1, May 2003.

    [2] ITU-R. BT.601-5: Studio encoding parameters of digital television for standard 4:3 and wide-screen 16:9 aspect ratios, 1998. (Formerly CCIR601)

    [3] Iain E. G. Richardson, H.264 and MPEG-4 Video Compression: Video Coding for Next-generation Multimedia, John Wiley & Sons Ltd, 2004.

    [4] P. List, A. Joch, J. Lainema, G. Bjontegaard, and M. Karczewicz, “Adaptive Deblocking Filter”, IEEE Trans. Circuits Syst. Video Technol., vol. 13, pp. 614–619, July 2003.

    [5] M. Horowitz, A. Joch, F. Kossentini, and A. Hallapuro, “H.264/AVC baseline profile decoder complexity analysis”, IEEE Trans. on Circuits and Systems for Video Tech., vol. 13, no. 7, pp. 704-716, Jul 2003.

    [6] H. Chen, R. Hu, and Y. Gao, “An effective method of deblocking filter for H.264/AVC”, IEEE International Symposium on Communications and Information Technologies, pp. 1092-1095, Oct. 2007.

    [7] J. Lou, A. Jagmohan, D. He, L. Lu, and M.T. Sun, “Statistical Analysis Based H.264 High Profile Deblocking Speedup”, IEEE International Symposium on Circuits and Systems, pp. 3143-3146, May 2007.

    [8] J. Lou, A. Jagmohan, D. He, L. Lu, and M.T. Sun, “High Speed H.264 High Profile Deblocking using Statistical Analysis and Logic Optimization”, IEEE International Conference on Multimedia and Expo, pp. 1918-1921, July 2007.

    [9] H. Yadav and K. R. Rao, “Optimization Of The Deblocking Filter In H.264 Codec For Real Time Implementation”, IEEE International Symposium on Communications and Information Technologies, pp. 932-936, Sept. 2006.

    [10] M.O. Khan, U. Khan, S.A. Rahim, and S.I. Ali, “Optimization of motion compensation for H.264 decoder by pre-calculation”, The 8th IEEE International Multitopic Conference, pp. 55-60, Dec. 2004.

    [11] Q. Xue, J. Liu, S. Wang, and J. Zhao, “H.264/AVC baseline profile decoder optimization on independent platform”, IEEE International Conference on Wireless Communications, Networking and Mobile Computing, vol. 2, pp. 1253-1256, Sept. 2005.

    [12] C. H. Kuo, G. C. Huang, L. C. Chang, and B. D. Liu, “Source code flow optimization for H.264/AVC video decoder implementing on a low-cost embedded system platform”, IEEE Region 10 Conference, pp. 1-4, 2007.

    [13] Y. H. Moon, I. K. Eom, and S. W. Ha, “Efficient memory architecture for fast total_zeros decoding in H.264/AVC CAVLC decoder”, IEEE International Conference on Multimedia and Expo, pp. 65-68, April 2008.

    [14] W. Hao and M. Radetzki, “A Data Traffic Efficient H.264 Deblocking IP”, IEEE International Symposium on Circuits and Systems, pp. 3430-3433, May 2008.

    [15] N.R. Zhang, M. Li, Y. Y. Li, and W. C. Wu, “High Performance and High Efficiency Memory Management System for H.264/AVC Application in the Dual-Core Platform”, International Joint Conference SICE-ICASE, pp. 5719-5722, Oct. 2006.

    [16] A. Wise, R. Whitton, Y. Nemouchi, and B. T. Paul, “Model for estimating prediction bandwidth for H.26L”, JVT-E093, Oct. 2002.

    [17] H.264/AVC reference software (JM 9.8). Available: http://iphome.hhi.de/suehring/tml/

    Full text not authorized for public release (on-campus network)
    Full text not authorized for public release (off-campus network)
