Author: 戴宏穎 (Tai, Hung-Ying)
Thesis title: Crystal: an extensible framework for profiling and optimizing large scale system software
Thesis title (Chinese): Crystal:大型系統軟體效能分析與優化框架
Advisor: 李哲榮 (Lee, Che-Rung)
Oral examination committee: 鍾葉青 (Chung, Yeh-Ching), 徐慰中 (Hsu, Wei-Chung)
Degree: Master
Department:
Year of publication: 2017
Academic year of graduation: 105 (ROC calendar)
Language: English
Number of pages: 30
Keywords (Chinese): 編譯器, 圖計算, 回饋導向優化, 系統軟體
Keywords (English): Compiler, Graph Computing, Feedback-directed Optimization, System Software
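  • As software development has evolved, modern system software has grown in scale and complexity, and codebases of several million lines are now common. As the code grows, the tools and techniques needed to profile and optimize it comprehensively become increasingly difficult to apply.
    This thesis therefore proposes and implements an extensible framework that integrates a compiler with multiple profiling tools. On the user's behalf, it collects and stores static analysis data from compile time and dynamic analysis data from the profiling tools; users only need to inherit from and develop specific modules to obtain this data for analysis and optimization. Crystal adopts profile-guided optimization.
    To verify that the collected data is correct and useful, we present an application example. The data collected by the profiling tools is reconstructed into a weighted function call graph, all functions are clustered with the Louvain community detection algorithm from graph computing, and the functions within each group are then reordered and repositioned with the Pettis-Hansen algorithm to achieve better performance. In our experiments, we target LLVM and select twenty programs from the LLVM project for compilation tests, obtaining execution performance improvements between 4.5% and 12.3%.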


    In recent years, system software has become larger and more complex. Integrating profile data collection, analysis algorithm implementation, and optimization for large-scale programs poses several challenges. Therefore, we present an extensible framework called Crystal, which wraps multiple profilers, compilers, and analyzers to gather useful data, including static and dynamic function call graphs, function counts, CPU cycles, and cache miss rates. We then combine these data into a weighted function call graph. With this framework, users can apply analysis and optimization algorithms to a program with less worry about the complexity of the profile data generated by the various tools.
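As an illustration of combining static and dynamic call data into a weighted function call graph, here is a minimal Python sketch. It is not Crystal's implementation: the function name `build_weighted_call_graph`, the input shapes, and the choice of call counts as the edge weight are all assumptions made for this example.

```python
from collections import defaultdict


def build_weighted_call_graph(static_edges, dynamic_calls):
    """Merge static call edges with dynamic call counts (hypothetical helper).

    static_edges:  iterable of (caller, callee) pairs known at compile time.
    dynamic_calls: iterable of (caller, callee, count) tuples from a profiler.
    Returns {caller: {callee: weight}}; edges that never ran keep weight 0.
    """
    graph = defaultdict(dict)
    # Every statically known edge appears in the graph, even if never executed.
    for caller, callee in static_edges:
        graph[caller].setdefault(callee, 0)
    # Dynamic samples add weight; repeated samples for one edge accumulate.
    for caller, callee, count in dynamic_calls:
        graph[caller][callee] = graph[caller].get(callee, 0) + count
    return {caller: dict(callees) for caller, callees in graph.items()}


# Example input (made-up function names and counts).
static = [("main", "parse"), ("main", "emit"), ("parse", "lex")]
dynamic = [("main", "parse", 10), ("parse", "lex", 250), ("main", "parse", 5)]
graph = build_weighted_call_graph(static, dynamic)
```

Other measurements named in the abstract, such as CPU cycles or cache miss rates, could be attached to nodes or edges in the same way; which metric serves as the weight is a design choice this sketch leaves open.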
    In this paper, we focus on a framework design that wraps the various output formats of different tools. Our framework provides a compiler unit for translating source code to IR, an optimizer unit for applying profile-guided analysis results to the IR, a profiler unit for gathering accurate data, an analyzer unit for determining optimization strategies, and a datastore unit for storing archived data.
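To make the extension model concrete, the following Python sketch shows how a user-defined analyzer might read data that a datastore unit has archived. The class names `Datastore`, `Analyzer`, and `HotFunctionAnalyzer`, their methods, and the `"function_counts"` key are hypothetical and are not taken from Crystal itself.

```python
from abc import ABC, abstractmethod


class Datastore:
    """In-memory stand-in for a datastore unit (hypothetical API)."""

    def __init__(self):
        self._records = {}

    def put(self, key, value):
        self._records[key] = value

    def get(self, key):
        return self._records[key]


class Analyzer(ABC):
    """Base class a user would subclass to plug in an analysis (hypothetical)."""

    @abstractmethod
    def analyze(self, store):
        """Return an optimization hint computed from archived profile data."""


class HotFunctionAnalyzer(Analyzer):
    """Example user module: report the most frequently called functions."""

    def __init__(self, top_n):
        self.top_n = top_n

    def analyze(self, store):
        counts = store.get("function_counts")
        # Sort by descending count; break ties by name for a stable result.
        ranked = sorted(counts, key=lambda name: (-counts[name], name))
        return ranked[: self.top_n]


# A profiler unit would normally fill the store; here the data is made up.
store = Datastore()
store.put("function_counts", {"lex": 250, "parse": 15, "emit": 0})
hot = HotFunctionAnalyzer(top_n=2).analyze(store)
```

The point of the sketch is the separation of concerns: the analyzer sees only the stored data, not the output format of whichever tool produced it.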
    Finally, we present an application example built on our framework. We propose a two-step methodology for changing code layout. First, we build a weighted function call graph from profiling data and apply a community detection algorithm, the Louvain method, to split the entire graph into multiple partitions. Second, we apply the classic Pettis-Hansen algorithm within each partition to reorder functions. In our evaluation, we choose LLVM as the target and design 20 test cases to measure performance. The result is a performance improvement of 4.5% to 12.4% over O0.
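The second step can be sketched in Python as follows. This is a simplified greedy chain-merging procedure in the spirit of Pettis-Hansen function ordering, not the thesis's implementation. It assumes the partitions have already been produced by a community detection step such as the Louvain method, and the names `pettis_hansen_order` and `layout` are invented for this example.

```python
def pettis_hansen_order(functions, weights):
    """Order one partition so heavily connected functions sit close together.

    functions: list of function names in the partition.
    weights:   {(f, g): w} call weights; direction is ignored here.
    """
    members = set(functions)
    undirected = {}
    for (a, b), w in weights.items():
        # Keep only edges inside this partition and drop self-calls.
        if a != b and a in members and b in members:
            key = frozenset((a, b))
            undirected[key] = undirected.get(key, 0) + w

    def edge(a, b):
        return undirected.get(frozenset((a, b)), 0)

    chains = [[f] for f in functions]  # every function starts as its own chain
    while len(chains) > 1:
        # Find the pair of chains joined by the largest total weight.
        best = None
        for i in range(len(chains)):
            for j in range(i + 1, len(chains)):
                total = sum(edge(a, b) for a in chains[i] for b in chains[j])
                if best is None or total > best[0]:
                    best = (total, i, j)
        total, i, j = best
        if total == 0:
            break  # the remaining chains never call each other
        a, b = chains[i], chains[j]
        # Try the four orientations; keep the one with the heaviest junction edge.
        options = [(a, b), (a, b[::-1]), (a[::-1], b), (a[::-1], b[::-1])]
        left, right = max(options, key=lambda lr: edge(lr[0][-1], lr[1][0]))
        chains[i] = left + right
        del chains[j]
    return [f for chain in chains for f in chain]


def layout(partitions, weights):
    """Concatenate the per-partition orders into one function layout."""
    order = []
    for part in partitions:
        order.extend(pettis_hansen_order(part, weights))
    return order


# Made-up weights; the two partitions stand in for Louvain's output.
weights = {("main", "parse"): 15, ("parse", "lex"): 250, ("main", "lex"): 1,
           ("emit", "write"): 40, ("main", "emit"): 2}
order = layout([["lex", "main", "parse"], ["emit", "write"]], weights)
```

Restricting the reordering to one partition at a time keeps each greedy merge small, which is one plausible motivation for partitioning a large call graph before applying the ordering step.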

    Chapter 1 Introduction
    Chapter 2 Design of Crystal Framework
      2.1 Overview
      2.2 Examples
    Chapter 3 Implementation
      3.1 System Architecture
      3.2 Compiler Unit
      3.3 Optimizer Unit
      3.4 Profiler Unit
        3.4.1 Client-Server Architecture
        3.4.2 Building Engine
        3.4.3 Profiling Engine
      3.5 Analyzer Unit Engine
      3.6 Datastore Unit
    Chapter 4 Evaluation
      4.1 Benchmark Details
      4.2 System Evaluation
      4.3 Results
    Chapter 5 Related work
    Chapter 6 Conclusions and Future Work
    REFERENCES
