
Graduate student: Yu, Fan-Wei (游凡緯)
Thesis title: A Critical-Section-Level Timing Synchronization Approach for Deterministic Multi-Core Instruction-Set Simulations
Advisor: 蔡仁松
李哲榮
Oral defense committee: 李昆忠
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2012
Academic year of graduation: 100 (ROC calendar)
Language: English
Number of pages: 41
Keywords: Deterministic, Multi-core instruction-set simulation, Timing synchronization
    This thesis proposes a critical-section-level timing synchronization approach for deterministic multi-core instruction-set simulation (MCISS). The approach synchronizes at each lock access rather than at every shared-variable access, and uses a simple scheme for managing lock usage status. This significantly improves simulation performance while ensuring that all critical sections execute in a deterministic order. Experiments show that the approach is on average 295% faster than the shared-variable synchronization approach, and that it can effectively facilitate system-level software/hardware co-simulation.
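The idea of synchronizing only at lock accesses can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes a hypothetical trace format in which each simulated core performs `compute`, `lock`, and `unlock` operations, and it assumes that the earliest pending lock access in simulated time goes first, with ties broken by core id. The names `simulate` and `run_to_sync_point` are illustrative, and the thesis's lock usage status management and spin-waiting optimization are not modeled.

```python
import heapq


def simulate(traces):
    """Return the lock-acquisition order as (time, core, lock) tuples.

    traces maps an integer core id to a list of (op, arg) pairs, where op is
    "compute" (arg = simulated cycles), "lock" or "unlock" (arg = lock name).
    Traces are assumed well formed: a core only unlocks a lock it holds.
    """
    clock = {core: 0 for core in traces}   # local simulated time per core
    pc = {core: 0 for core in traces}      # index of the next op per core
    holder = {}                            # lock name -> core holding it
    waiters = {}                           # lock name -> cores blocked on it
    pending = []                           # heap of (time, core) at a sync point
    acquired = []

    def run_to_sync_point(core):
        # Compute ops need no synchronization: just advance the local clock.
        ops = traces[core]
        while pc[core] < len(ops) and ops[pc[core]][0] == "compute":
            clock[core] += ops[pc[core]][1]
            pc[core] += 1
        if pc[core] < len(ops):
            heapq.heappush(pending, (clock[core], core))

    for core in traces:
        run_to_sync_point(core)

    while pending:
        # The earliest pending lock access goes first; ties break on core id.
        now, core = heapq.heappop(pending)
        op, lock = traces[core][pc[core]]
        if op == "lock":
            if lock in holder:
                # Lock is taken: block this core until the holder releases it.
                waiters.setdefault(lock, []).append(core)
                continue
            holder[lock] = core
            acquired.append((now, core, lock))
        else:  # "unlock"
            del holder[lock]
            for blocked in waiters.pop(lock, []):
                # A blocked core resumes no earlier than the release time.
                clock[blocked] = max(clock[blocked], now)
                heapq.heappush(pending, (clock[blocked], blocked))
        pc[core] += 1
        run_to_sync_point(core)
    return acquired


traces = {
    0: [("compute", 5), ("lock", "L"), ("compute", 10), ("unlock", "L")],
    1: [("compute", 3), ("lock", "L"), ("compute", 4), ("unlock", "L")],
}
print(simulate(traces))
```

In this sketch the cores never coordinate during `compute` operations. Only lock and unlock operations pass through the ordered queue, so the acquisition order depends on simulated time alone and not on how the host schedules the work.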

    List of Tables
    List of Figures
    Chapter 1 Introduction
    Chapter 2 Related Work
    Chapter 3 Critical-Section-Level Timing Synchronization
        3.1. Identifying Lock Sync Points
        3.2. Spin-waiting Optimization
        3.3. Lock Activation
        3.4. The Critical-Section-Level Simulations
    Chapter 4 Experimental Results
    Chapter 5 Conclusion
    Bibliography

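    Full-text release date: full text not authorized for public release (on-campus network)
    Full-text release date: full text not authorized for public release (off-campus network)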
