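Author: 唐文力
Wen-Li Tang
Thesis title: 基於Starfish DSP 架構下,高效能,高彈性的模擬框架
A Fast, and Flexible Simulation Framework for Starfish DSP Architecture
Advisor: 鍾葉青
Yeh-Ching Chung
Committee members:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2006
Academic year of graduation: 94 (ROC calendar)
Language: English
Number of pages: 41
Keywords (Chinese): 模擬器, 模擬與塑型, 超長指令字集, 數位訊號處理器
Keywords (English): Simulator, Simulation and Modeling, VLIW, DSP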
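  • In the development of a new microprocessor, a simulator is the only means of
    quantitative evaluation. However, the requirements placed on a simulator vary
    with each developer's goals. Implementing a simulator for a microprocessor that
    is still under development involves many trade-offs, and the most important is
    the balance among simulation speed, accuracy, and flexibility. In this thesis, we
    present a carefully considered design that balances these goals. For speed, we
    provide a performance model of the simulator, introduce a novel executable
    control flow graph (ECFG), and show which techniques are most effective at
    accelerating simulation. For accuracy, our simulator can model clock events and
    detailed register and memory information. For flexibility, we provide a layered
    approach that allows the simulator to be extended into other kinds of
    applications, and we show how to apply it to extend the simulator into a GNU
    Debugger (GDB) and a test framework. Beyond speed and flexibility, this thesis
    also collects, describes, and applies existing simulation techniques to help
    simulator architects choose the technique best suited to their environment, and
    applies the suitable techniques to the Starfish DSP simulator.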


    In the development of new microprocessors, quantitative evaluation is possible
    only by using a simulator. However, different users have different requirements
    for the simulator. The trade-off between simulation time and flexibility is the
    most important decision for any simulator. This paper presents a deliberate design
    that achieves these goals at the same time. For fast simulation, we provide a
    performance model for analyzing a simulator and introduce a novel executable
    control flow graph (ECFG). Furthermore, we propose an effective way to improve
    the performance of a simulator. For high precision, our simulator is able to simulate
    clock edge events and to expose detailed register and memory information. For high
    flexibility, we present a layered approach that can be extended to any kind of
    emulator, even a GNU debugger (GDB) or a test suite. Besides the performance and
    extension issues, this paper also collects, describes, applies, and compares various
    simulation techniques to aid simulator architects in selecting the most appropriate
    one. We apply these techniques to the Starfish DSP simulator.

    1 Introduction
    2 Related Work
    3 The Performance Model and Processor Architecture
      3.1 Basic Simulation Algorithms and Performance Analysis
      3.2 Starfish DSP Architecture
    4 The Design of Our Simulator
      4.1 Layered Approach of the Simulator Architecture
      4.2 Our Simulator Design for Starfish DSP Core
        4.2.1 Emulation Interface vs. Emulator Interface
      4.3 Two Real Cases for Extensibility
        4.3.1 GNU Debugger
        4.3.2 Test Suite
    5 The Performance Optimization Algorithms
      5.1 Executable Control Flow Graph
        5.1.1 The Concept of ECFG
        5.1.2 The Execution Algorithm of ECFG
        5.1.3 The Greedy Partition and Mapping Algorithm
    6 Alternative of Optimization Algorithms
      6.1 Direct Event-Driven Device vs. Proxy Event-Driven Device
      6.2 Static Instruction Table vs. Dynamic Action Table
    7 Experimental Result
      7.1 The Experimental Environment
      7.2 The Experimental Result
    8 Conclusions
    9 References

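    Full-text release date: full text not authorized for public release (on-campus network)
    Full-text release date: full text not authorized for public release (off-campus network)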
