
Author: 林子翔
Lin, Tzu Hsiang
Thesis Title: 行動嵌入式多核心系統上的OpenVX框架排程最佳化
Scheduling Methods for OpenVX Programs on Mobile Multi-Core Systems
Advisor: 李政崑
Lee, Jenq-Kuen
Committee Members: 陳呈瑋
黃思皓
Degree: Master
Department:
Year of Publication: 2016
Academic Year of Graduation: 104 (ROC calendar)
Language: English
Number of Pages: 41
Keywords: OpenVX, scheduling, node coarsening, mobile embedded systems
  • Modern mobile embedded systems use heterogeneous multi-core architectures to improve performance under a limited energy budget. To program such systems, OpenVX provides a standard framework for computer vision processing. OpenVX uses a graph-based execution model to describe computations and the data flow between them. Each computation node in the graph can be dispatched to a different target: multi-core CPUs running sequential C or an OpenMP runtime, GPUs through OpenCL, a DSP through remote procedure calls, or dedicated hardware. Scheduling all the computation nodes onto these targets efficiently is therefore an optimization problem.
    In this thesis, we propose a method for scheduling an OpenVX task graph that considers both memory locality and system throughput. The method has two phases. The first phase coarsens the graph by clustering the nodes that satisfy the coarsening conditions into groups. The second phase schedules the nodes, assigning each one to a suitable target according to its characteristics. Experiments on the Qualcomm DragonBoard 810 development board show that the proposed two-phase method schedules OpenVX programs effectively in heterogeneous multi-core environments.
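    This record does not describe the thesis's clusterCP and clusterG procedures, so the following is only a minimal sketch of a generic two-phase coarsen-then-schedule flow, not the author's algorithm. It assumes a simple coarsening rule (merge linear chains whose members share a target, so intermediate data stays local) and a greedy earliest-finish-time assignment. The node names, per-target costs, and the fixed `transfer` penalty are all hypothetical.

```python
from collections import defaultdict


def coarsen(order, edges, cost):
    """Phase 1 (assumed rule): merge linear chains whose members share a target."""
    succs, preds = defaultdict(list), defaultdict(list)
    for src, dst in edges:
        succs[src].append(dst)
        preds[dst].append(src)
    clusters, targets, cluster_of = [], [], {}
    for node in order:  # `order` must be a topological order of the graph
        p = preds[node]
        # A chain link: the node's only predecessor has this node as its only successor.
        cid = cluster_of[p[0]] if len(p) == 1 and len(succs[p[0]]) == 1 else None
        if cid is not None and targets[cid] & set(cost[node]):
            targets[cid] &= set(cost[node])  # keep only targets every member supports
        else:
            cid = len(clusters)  # start a new cluster
            clusters.append([])
            targets.append(set(cost[node]))
        cluster_of[node] = cid
        clusters[cid].append(node)
    return clusters, cluster_of, targets


def schedule(order, edges, cost, transfer=1.0):
    """Phase 2 (assumed rule): place each cluster on the target that finishes it earliest."""
    clusters, cluster_of, targets = coarsen(order, edges, cost)
    cluster_preds = defaultdict(set)
    for src, dst in edges:
        a, b = cluster_of[src], cluster_of[dst]
        if a != b:
            cluster_preds[b].add(a)
    ready = defaultdict(float)  # time at which each target becomes free
    finish, placed = {}, {}
    for cid, members in enumerate(clusters):  # cluster ids follow topological order
        best = None
        for t in sorted(targets[cid]):
            start = ready[t]
            for p in cluster_preds[cid]:
                # Crossing targets pays the hypothetical transfer penalty.
                start = max(start, finish[p] + (transfer if placed[p] != t else 0.0))
            end = start + sum(cost[m][t] for m in members)
            if best is None or end < best[0]:
                best = (end, t)
        finish[cid], placed[cid] = best
        ready[best[1]] = best[0]
    return {n: placed[cluster_of[n]] for n in order}, max(finish.values())


# Hypothetical five-node vision graph with estimated run times per target.
order = ["convert", "blur", "sobel", "hist", "merge"]
edges = [("convert", "blur"), ("blur", "sobel"), ("convert", "hist"),
         ("sobel", "merge"), ("hist", "merge")]
cost = {
    "convert": {"cpu": 2, "gpu": 1},
    "blur": {"cpu": 4, "gpu": 1},
    "sobel": {"cpu": 4, "gpu": 1},
    "hist": {"cpu": 2, "dsp": 1},
    "merge": {"cpu": 1, "gpu": 2},
}
placement, makespan = schedule(order, edges, cost)
```

    In this sketch, `blur` and `sobel` form one cluster and are therefore always placed on the same target, which is the memory-locality effect that coarsening is meant to capture. A real scheduler would replace the fixed `transfer` penalty with measured transfer costs for each pair of targets.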

    Abstract
    Contents
    List of Figures
    List of Tables
    1 Introduction
      1.1 Introduction
      1.2 Overview of the Thesis
    2 Background
      2.1 OpenVX Programming
      2.2 OpenCL Runtime Flow
      2.3 FastRPC mechanism of Hexagon DSP
    3 Coarsen-Scheduling algorithm
      3.1 Node coarsen algorithm
        3.1.1 The clusterCP procedure
        3.1.2 The clusterG procedure
        3.1.3 Node Coarsen example
      3.2 Node scheduling algorithm
        3.2.1 Node Scheduling example
    4 Experimental Results
      4.1 Experimental Environment
      4.2 Experiments
        4.2.1 Experimental Design
        4.2.2 Experimental Results
    5 Conclusion
      5.1 Summary
      5.2 Future Work


