
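Author: 高子揚 (Kao, Tze-Yang)
Thesis title: 支援多重影像規格雙向同時運算之低成本DCT轉換核心
A Low-cost Multi-Standard Simultaneous Forward and Inverse DCT Transform Core
Advisor: 張慶元 (Chang, Tsin-Yuan)
Committee members: 陳元賀 (Chen, Yuan-Ho), 洪進華 (Hong, Jin-Hua)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2012
Academic year of graduation: 101 (ROC calendar)
Language: Chinese
Number of pages: 89
Keywords (Chinese, translated): discrete cosine transform, inverse discrete cosine transform, multi-level factor share, simultaneous forward and inverse transform, H.264, multi-standard video
Keywords (English): IDCT, Multi-level Factor Share, Simultaneous Transform, Multi-Standard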
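Abstract (Chinese, translated):
    MPEG-1/2/4, H.264, and VC-1 are widely used video compression systems. In this thesis we use multi-level factor share and distributed arithmetic to build a low-cost, multi-standard DCT (discrete cosine transform) and IDCT (inverse discrete cosine transform) design. The architecture supports four transform sizes: 8 x 8, 8 x 4, 4 x 8, and 4 x 4. Multi-level factor share and distributed arithmetic let the coefficient-matrix circuits be shared, and adders and shifters take the place of multipliers. This substantially reduces the repeated partial terms in the coefficient multiplications and further reduces the number of adders needed in the adder tree, which saves area. In addition, we exploit the similarity between the DCT and IDCT coefficient matrices and reuse a single coefficient-matrix circuit in a time-interleaved manner. This lowers the area cost of the forward and inverse transforms and also lets the DCT and IDCT run simultaneously, maintaining a high throughput rate that meets the needs of real-time video coding. The transpose memory reorders the data and outputs them as two-dimensional data, and the SORT1 and iSORT1 circuits arrange them so that the 1-D and 2-D data are time-interleaved and share the 1-D core; that is, a single 1-D core computes the 1-D and 2-D data of both the DCT and the IDCT at the same time. As a result, the whole architecture needs only 33 adders to complete all computations. The MPEG-4 IDCT meets the accuracy requirements of IEEE 1180-1990, and the PSNR for each standard falls between 50 and 60 dB, giving a high-accuracy multi-standard design. Synthesized in TSMC 0.18-um technology, the design reaches an operating frequency of 227 MHz under the slow model, and with an area of 32K logic gates it achieves a throughput of 454 M pixel/sec, which is sufficient to support HDTV (1920 x 1080P@60Hz).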
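The idea of replacing multipliers with adders and shifters can be illustrated with a minimal Python sketch. It is a generic shift-and-add illustration, not the thesis's multi-level factor share algorithm: the function names and the constants used below are hypothetical, and the adder count shown is for a plain binary decomposition only.

```python
def shift_add_terms(constant):
    """Return the bit positions that are set in a non-negative integer constant."""
    if constant < 0:
        raise ValueError("this sketch handles non-negative constants only")
    return [bit for bit in range(constant.bit_length()) if (constant >> bit) & 1]


def shift_add_multiply(x, constant):
    """Multiply x by constant using only left shifts and additions."""
    total = 0
    for bit in shift_add_terms(constant):
        total += x << bit  # one shifted copy of x per set bit
    return total


def adder_count(constant):
    """Two-input adders needed to sum the shifted copies (number of terms - 1)."""
    return max(len(shift_add_terms(constant)) - 1, 0)


def shared_factor_multiply(x, factor, multipliers):
    """Compute x * factor * m for each m while building x * factor only once."""
    shared = shift_add_multiply(x, factor)  # common sub-product, reused below
    return [shift_add_multiply(shared, m) for m in multipliers]


if __name__ == "__main__":
    print(shift_add_multiply(7, 13))            # 7 + (7 << 2) + (7 << 3)
    print(adder_count(13))                      # 13 = 0b1101 has three set bits
    print(shared_factor_multiply(3, 5, [2, 3]))
```

In this sketch, `shared_factor_multiply` shows why sharing a common factor helps: the sub-product `x * factor` is built once and reused for every coefficient that contains it, so its adders are not duplicated.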


Abstract (English):
    Video and image compression standards such as MPEG-1/2/4, H.264, and VC-1 are widely used in video and image applications. In this thesis, multi-level factor share and distributed arithmetic are used to build a multi-standard DCT (Discrete Cosine Transform) and IDCT (Inverse Discrete Cosine Transform) core. The proposed architecture handles four transform sizes: 8 x 8, 8 x 4, 4 x 8, and 4 x 4. Multi-level factor share and distributed arithmetic allow the coefficient-matrix circuits to be shared, and the multipliers are replaced with adders and shifters. This reduces the redundant terms in the coefficient multiplications and also reduces the number of adders needed in the adder trees, leading to a low-cost design. In addition, based on the similarity between the DCT and IDCT coefficient matrices, the same circuit computes both transforms by interleaving them in time. This saves area and lets the DCT and IDCT operate simultaneously, which sustains a high throughput rate and meets the demands of real-time systems. The transpose memory turns the 1-D results into the input data of the 2-D pass. The SORT1 and iSORT1 circuits then order the 1-D and 2-D data so that the same 1-D core can be reused for the 2-D pass. In other words, a single 1-D core computes the 1-D and 2-D data of both the DCT and the IDCT at the same time. As a result, only 33 adders are needed for the whole computation. The proposed architecture meets the IEEE 1180-1990 precision requirement for the MPEG-4 IDCT and achieves a high peak signal-to-noise ratio (about 50 to 60 dB) across the supported standards. The design is synthesized in a TSMC 0.18-um process. It reaches an operating frequency of 227 MHz under the slow model and a throughput of 454 Mpixel/s with only 32K gates, which is enough to support the HDTV (1920 x 1080P@60Hz) specification.
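The reuse of one 1-D core for both passes of a 2-D transform, and of one coefficient matrix for both directions, can be sketched in Python as follows. This is a software illustration under stated assumptions, not the thesis's circuit: it uses a floating-point orthonormal DCT-II rather than the integer transforms defined by H.264 or VC-1, it runs the passes sequentially instead of interleaving them in time, and the names `dct_matrix`, `core_1d`, and `transform_2d` are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal n-point DCT-II coefficient matrix (floating point)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def transpose(m):
    """Swap rows and columns; stands in for the transpose memory."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def core_1d(coeff, block, inverse=False):
    """Single 1-D core: transforms every column of block.

    The inverse direction reuses the same coefficients, read transposed.
    """
    c = transpose(coeff) if inverse else coeff
    return matmul(c, block)


def transform_2d(block, inverse=False):
    """2-D DCT or IDCT as two 1-D passes with a transpose in between."""
    rows, cols = len(block), len(block[0])
    first = core_1d(dct_matrix(rows), block, inverse)     # pass over columns
    turned = transpose(first)                             # transpose memory
    second = core_1d(dct_matrix(cols), turned, inverse)   # pass over rows
    return transpose(second)


if __name__ == "__main__":
    block = [[float((3 * r + 5 * c) % 7) for c in range(4)] for r in range(8)]
    restored = transform_2d(transform_2d(block), inverse=True)
    print(max(abs(restored[r][c] - block[r][c])
              for r in range(8) for c in range(4)))
```

Because the block dimensions are read from the input, the same routine covers the 8 x 8, 8 x 4, 4 x 8, and 4 x 4 cases named in the abstract; the inverse path differs from the forward path only in how the shared coefficient matrix is read.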

Table of contents:
    1 Introduction and Motivation
        1.1 Introduction
        1.2 Research Motivation
        1.3 Previous Approaches
            1.3.1 Distributed Arithmetic
            1.3.2 Factor Sharing
            1.3.3 Matrix Decomposition
            1.3.4 Support for Two Transform Dimensions
        1.4 Contents of This Thesis
    2 Multi-level Factor Share Distributed-Arithmetic Simultaneous Forward and Inverse DCT Architecture
        2.1 2-D DCT/IDCT Algorithm
        2.2 1-D DCT/IDCT Row-Column Rearrangement
        2.3 Proposed Matrices for Simultaneous Forward and Inverse DCT
            2.3.1 Proposed Even-Part Matrix
            2.3.2 Proposed Odd-Part Matrix
        2.4 Multi-level Factor Share Distributed Arithmetic Architecture
            2.4.1 Mathematical Derivation of Distributed Arithmetic
            2.4.2 Mathematical Derivation of Factor Sharing
            2.4.3 Proposed Multi-level Factor Share Distributed Arithmetic Algorithm
        2.5 Proposed Multi-level Factor Share Distributed-Arithmetic Simultaneous Forward and Inverse DCT Circuit Architecture
            2.5.1 Proposed SORT1 Hardware Architecture
            2.5.2 Proposed SORT2 Hardware Architecture
            2.5.3 Proposed Even-Part Hardware Architecture
            2.5.4 Proposed Odd-Part Hardware Architecture
            2.5.5 Proposed iSORT2 Hardware Architecture
            2.5.6 Proposed iSORT1 Hardware Architecture
            2.5.7 Proposed Hardware Architecture for Simultaneous DCT and IDCT
    3 Simulation Results and Specification Comparison
        3.1 Simulation Results
        3.2 Specifications
        3.3 Comparison
    4 Conclusion and Future Work
        4.1 Conclusion
        4.2 Future Work
    References


Full text: not authorized for public release (on-campus and off-campus networks)