
Student: Chien, Yu Hsuan (簡佑軒)
Title: A Machine Learning Approach to Partitioning Aparapi Programs on HSA Environments (在異質系統架構的環境上以機器學習的方式分配Aparapi程式)
Advisor: Lee, Jenq Kuen (李政崑)
Committee members: Chen, Peng Sheng (陳鵬升); Hwang, Gwan Hwan (黃冠寰)
Degree: Master
Department: Department of Computer Science, College of Electrical Engineering and Computer Science
Year of publication: 2015
Academic year of graduation: 103 (2014-2015)
Language: English
Number of pages: 34
Chinese keywords: Aparapi, Java, machine learning, heterogeneous system architecture, program partitioning
English keywords: program partitioning
  • Java is one of the most widely used programming languages today. Software for the rapidly growing class of smart mobile devices, such as laptops and smartphones, is commonly developed in Java. As users' demand for multimedia performance keeps rising, GPUs have become standard equipment in smartphones and tablets, so making Java use the GPU effectively has become an important issue.
    Since AMD introduced the new Heterogeneous System Architecture (HSA) last year, which eliminates CPU/GPU data movement, heterogeneous multi-core systems have taken a large step forward in program execution speed. If Java can be used effectively on HSA, it will greatly help improve Java computing performance.
    At present, two tools, Aparapi and Sumatra, allow Java programs to run on HSA systems. This thesis proposes a program-partitioning optimization for Aparapi. We analyze the HSAIL that Aparapi generates from a Java program, count the occurrences of each kind of HSAIL instruction, and collect a set of factors related to CPU/GPU execution efficiency. These factors are gathered as training data to build a machine-learning model that predicts, from the HSAIL Aparapi generates, whether a given Java program should be executed on the CPU or the GPU, thereby improving program performance.
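The Aparapi programming style described above can be sketched as follows. `Kernel` here is a local stub standing in for `com.amd.aparapi.Kernel`, so the snippet compiles without the Aparapi jar; in real Aparapi the `run()` body is translated to HSAIL and dispatched to the GPU, whereas this stub simply loops on the CPU.

```java
// Sketch of the Aparapi programming model: a data-parallel loop is
// expressed as a Kernel whose run() body executes once per work-item.
// This Kernel class is a CPU-only stand-in for com.amd.aparapi.Kernel.
abstract class Kernel {
    private int globalId;
    public abstract void run();
    protected int getGlobalId() { return globalId; }
    // Real Aparapi would offload this range to the GPU via HSAIL.
    public void execute(int range) {
        for (int i = 0; i < range; i++) { globalId = i; run(); }
    }
}

public class Main {
    static float[] add(final float[] a, final float[] b) {
        final float[] sum = new float[a.length];
        Kernel kernel = new Kernel() {
            @Override public void run() {
                int gid = getGlobalId();      // this work-item's index
                sum[gid] = a[gid] + b[gid];   // one element per work-item
            }
        };
        kernel.execute(a.length);
        return sum;
    }

    public static void main(String[] args) {
        float[] s = add(new float[]{1f, 2f}, new float[]{3f, 4f});
        System.out.println(s[0] + " " + s[1]); // prints 4.0 6.0
    }
}
```

The real API additionally takes a `Range` object and lets the runtime choose an execution device, which is exactly the decision point this thesis targets.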


    Java is one of the most popular programming languages; it is used to develop software for mobile devices such as laptops and smartphones. As performance demands increase, GPUs are now equipped in most smartphones and tablets, so how Java programs can run more efficiently becomes an important issue. Recently, AMD released a brand-new architecture, the Heterogeneous System Architecture (HSA), which reduces CPU/GPU data movement and increases the performance of heterogeneous systems. Therefore, if we can let Java programs run efficiently on HSA, Java computing performance can be improved.

    In this thesis, we propose a program partitioning technique for Aparapi.
    Aparapi, developed by AMD, provides a flow that translates Java bytecode into HSAIL so that computation can be offloaded to the GPU. We implement a profiling mechanism in the Aparapi runtime that extracts factors and uses them to train a machine-learning model. With this model, we can predict whether the CPU or the GPU is better suited to a given Aparapi program and thereby shorten the total execution time. As a result, the partitioning technique makes Java programs perform better on heterogeneous system architectures than they would without it.
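The prediction step might look like the following sketch: the HSAIL instruction counts collected by the profiler form a feature vector, and a linear (SVM-style) decision function chooses the device. The feature names, weights, and bias here are illustrative assumptions, not the thesis's trained model.

```java
import java.util.Map;

public class Main {
    // Hypothetical trained linear model: score = w.x + bias.
    // Positive weights favor the GPU (arithmetic throughput),
    // negative weights favor the CPU (branches, irregular memory access).
    static final Map<String, Double> WEIGHTS = Map.of(
        "mul", 0.8, "add", 0.3, "branch", -1.2, "ld_global", -0.4);
    static final double BIAS = -0.5;

    // Decide the device from a map of HSAIL opcode -> occurrence count.
    static String predictDevice(Map<String, Integer> hsailCounts) {
        double score = BIAS;
        for (var e : hsailCounts.entrySet())
            score += WEIGHTS.getOrDefault(e.getKey(), 0.0) * e.getValue();
        return score > 0 ? "GPU" : "CPU";
    }

    public static void main(String[] args) {
        // Arithmetic-heavy kernel: leans toward the GPU.
        System.out.println(predictDevice(Map.of("mul", 5, "add", 4, "branch", 1)));
        // Branch-heavy kernel: leans toward the CPU.
        System.out.println(predictDevice(Map.of("mul", 1, "branch", 4)));
    }
}
```

In the actual system the weights would come from training (the thesis uses support vector machines via LIBSVM), and the feature vector would be extracted automatically from the HSAIL that Aparapi emits at runtime.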

    Abstract
    Contents
    List of Figures
    List of Tables
    1 Introduction
      1.1 Introduction
      1.2 Overview of the Thesis
    2 Background with APARAPI Programs
      2.1 Guide Programming of Aparapi
      2.2 The Aparapi Translation Flow
    3 Motivation
    4 Partitioning Aparapi Programs
      4.1 Aparapi Code Feature Extraction
        4.1.1 Introduction of Machine-Learning Factors
        4.1.2 Implement Vector Instruction
        4.1.3 Collecting Factors
      4.2 Building Machine-Learning Based Model
      4.3 Introduction of Support Vector Machines
    5 Experiment Results
      5.1 Experimental Environment
      5.2 Experimental Data Format for Machine-Learning
      5.3 Experimental Results
    6 Conclusion
      6.1 Summary
      6.2 Future Work


    Full text not authorized for public release (campus network, off-campus network, and National Central Library / Taiwan NDLTD system).