
Graduate student: 鄭孟璿
Thesis title: 針對多媒體嵌入式數位訊號處理器基於硬體特殊最佳化之超長指令集後處理編譯器框架
A VLIW-Based Post Compilation Framework for Multimedia Embedded DSPs with Hardware Specific Optimizations
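Advisor: 鍾葉青
Oral examination committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2007
Academic year of graduation: 95 (ROC calendar)
Language: Chinese
Number of pages: 32
Keywords (translated from Chinese): VLIW, compiler optimization, digital signal processor (DSP)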
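  • In the design of high-performance, low-power multimedia embedded systems, VLIW-based embedded digital signal processors (DSPs), which rely on the compiler to schedule instructions and increase instruction-level parallelism, are becoming increasingly popular and play an important role. For this reason, we need optimizing compilers for embedded DSPs that generate code that is efficient in terms of performance, power consumption, area, and productivity. Because it can increase the utilization of DSP-specific hardware, an optimizing embedded DSP compiler spares designers from tuning applications by hand, a practice that adds cost and lengthens time to market. In this thesis, we develop a post compilation framework. It uses runtime information to optimize the output of other compilers and to increase the utilization of DSP-specific hardware. The experimental results show that the framework can optimize executables produced by Blackfin GCC 3.4 and VDSP++ 4.5, yielding average performance improvements of 17.56% and 8.8%, respectively. In addition, the framework can further optimize H.264, a real multimedia program already optimized with a hand-written library, and obtain a 5.8% performance improvement.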


    In the design of high-performance, low-power multimedia embedded systems, VLIW-based embedded DSPs, which depend on the compiler to exploit instruction-level parallelism (ILP), have become popular and play an important role. To use embedded DSPs effectively, we therefore need optimizing compilers that generate code that is efficient in terms of performance, power, area, and productivity. By exploiting the hardware-specific features of DSPs, such compilers spare designers from optimizing applications by hand, which lengthens time-to-market, without sacrificing performance. In this thesis, we show that the proposed post compilation framework can use runtime information to exploit hardware-specific features of DSPs and optimize the output of other compilers. The simulation results demonstrate that the framework can further optimize programs already compiled by Blackfin GCC 3.4 and VDSP++ 4.5 at optimization level 3, gaining an additional 17.56% and 8.8% performance on average, respectively. For H.264, a real multimedia program already optimized with a hand-tuned DSP library, the framework gains an additional 5.8% performance on average.

    Chapter 1. Introduction
    Chapter 2. Related work
    Chapter 3. The Proposed Post Compilation Framework
    Chapter 4. Performance Evaluation and Simulation Results
    Chapter 5. Conclusions
    References


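    Full-text release date: full text not authorized for public release (on-campus network)
    Full-text release date: full text not authorized for public release (off-campus network)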
