
Author: 羅祥維 (Lo, Hsiang-Wei)
Thesis title: 在晶片網路下的追蹤驅動模擬之因果感知中介表示法
Causality-aware Intermediate Representation for Trace-driven Simulation of Network-on-Chip
Advisor: 金仲達 (King, Chung-Ta)
Committee members: 徐慰中 (Hsu, Wei-Chung), 金仲達 (King, Chung-Ta), 劉靖家 (Liou, Jing-Jia)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2011
Academic year of graduation: 99 (ROC calendar)
Language: English
Number of pages: 39
Keywords (Chinese): 因果關係, 中介表示, 追蹤, 晶片網路, 追蹤驅動模擬
Keywords (English): Causality, Intermediate Representation, Trace, NOC, Trace-driven simulation
    As the number of cores on a chip increases, the design of the network-on-chip (NOC) is becoming critical and challenging. A common practice for exploring the design space and evaluating NOC performance is trace-driven simulation. Trace-driven simulators are relatively easy to develop and run fast, which allows quick evaluation of different design options. Unfortunately, traces are often generated on specific machines or emulators and contain limited information, e.g., when and from where a message is injected. When applied to new architectures, these traces might not conform to the characteristics of those architectures, leading to inaccurate simulation results.

    The primary reason current traces cannot adapt and conform to the architecture under study is that they do not retain the causal relationships between events, and thus the semantics of the execution. In this thesis, we propose a new methodology for trace-driven simulation of NOC. The methodology consists of two phases. The first phase generates rich traces, called the intermediate representation (IR), which record the execution behavior of target applications by tracking the happen-before relations of computations and communications. In the second phase, the IR is adapted to produce traces that match the architecture to be studied. In essence, this synthesizes new, matching NOC traces for the target architecture. Our evaluation on Tilera's TILE64 platform shows that the proposed methodology achieves an average error of 3% compared to real machines, whereas current practices incur errors of 80% to 300% against the same ground truth.
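
    The two-phase idea can be illustrated with a minimal Python sketch. It is not the thesis's actual IR format or converter: the names `IREvent`, `synthesize_trace`, `compute_scale`, and `cycles_per_flit` are hypothetical, the timing model is deliberately simplistic (no contention or routing), and events are assumed to be listed in an order that respects their dependencies. The point is only that event times are derived from happen-before edges instead of being copied from recorded timestamps.

```python
from dataclasses import dataclass, field


@dataclass
class IREvent:
    """One hypothetical IR event: a computation or a message send."""
    eid: int      # unique event id
    core: int     # core on which the event occurs
    kind: str     # "compute" or "send"
    cost: int     # compute: cycles on the reference core; send: size in flits
    deps: list = field(default_factory=list)  # ids of events that happen before this one


def synthesize_trace(events, compute_scale=1.0, cycles_per_flit=1):
    """Derive (start, finish) times for each event on a target architecture.

    Assumes `events` is already in dependency (topological) order.
    compute_scale < 1 models faster cores; cycles_per_flit > 1 models
    a narrower link. Both parameters are illustrative assumptions.
    """
    finish = {}
    schedule = {}
    for ev in events:
        # An event may start only after every event it depends on has finished.
        start = max((finish[d] for d in ev.deps), default=0)
        if ev.kind == "compute":
            duration = ev.cost * compute_scale
        else:
            duration = ev.cost * cycles_per_flit
        finish[ev.eid] = start + duration
        schedule[ev.eid] = (start, start + duration)
    return schedule


# Core 0 computes and then sends 8 flits; core 1 computes, then needs the message.
events = [
    IREvent(0, core=0, kind="compute", cost=100),
    IREvent(1, core=0, kind="send", cost=8, deps=[0]),
    IREvent(2, core=1, kind="compute", cost=50),
    IREvent(3, core=1, kind="compute", cost=30, deps=[1, 2]),
]

baseline = synthesize_trace(events)                        # reference machine
narrow_link = synthesize_trace(events, cycles_per_flit=4)  # slower network
fast_cores = synthesize_trace(events, compute_scale=0.5)   # faster processors
```

    In this sketch, event 3 starts at cycle 108 on the baseline but at cycle 132 with the narrower link, and the message injection (event 1) moves from cycle 100 to cycle 50 when the cores are twice as fast. A trace that stored only fixed timestamps from the reference machine would keep the original times in both cases.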


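    Thesis abstract (translated from Chinese)
    As the number of processor cores on a chip increases, the design of network-on-chip (NOC) systems is becoming demanding and challenging. Trace-driven simulation is commonly used to design the hardware and measure overall performance. Trace-driven simulators offer relatively fast simulation for weighing hardware design options and are also relatively easy to develop. However, traces are usually generated on one particular machine or simulator and contain limited information, such as where a message is injected into the NOC. When traces generated on one machine or simulator are applied to a new hardware architecture, they may not conform to the characteristics of the new platform, which leads to inaccurate simulation results.
    The main reason current traces cannot adapt and conform to the characteristics of a hardware architecture is that they do not preserve the causal relationships between events during program execution. In this thesis, we propose a new method for trace-driven simulation. The method consists of two phases. The first phase generates rich traces, which we call the intermediate representation (IR). The IR records the execution behavior of the target program by tracking the causal relationships between processor computations and communications. In the second phase, we use the IR to generate traces that match the hardware architecture. This effectively synthesizes matching NOC traces for the target hardware architecture. Our experiments were carried out on Tilera's TILE64 platform; the results show a relative error of only 3% between our method and runs on the real machine, whereas current methods show errors of 80% to 300% against the real machine.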

    Contents
    1 Introduction
      1.1 A Motivating Example
      1.2 Benchmark Intermediate Representation
    2 Problem Formulation
      2.1 Preliminary
      2.2 General Formulation
      2.3 Narrow Formulation
    3 Design Considerations
      3.1 Backgrounds
        3.1.1 Program Behaviors of Parallel Programs
        3.1.2 Happen-before Relations between Program Behaviors
      3.2 IR Representation
        3.2.1 IR Events
        3.2.2 Happen-before Relations between IR Events
      3.3 The Relations between Trace and IR
    4 Experimental Evaluation
      4.1 Experiment Setup
      4.2 Methodology
      4.3 Converter
      4.4 Evaluation Results
        4.4.1 IR Validation
        4.4.2 Relationships between Injection Rate and Scalar
        4.4.3 IR Accuracy
          4.4.3.1 Link bandwidth
          4.4.3.2 Frequency of processor element
          4.4.3.3 Total execution time
    5 Related Works
    6 Conclusion


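    Full-text release date (campus network): full text not authorized for public release
    Full-text release date (off-campus network): full text not authorized for public release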
