
Student: Chu, Yen-Chu (朱彥竹)
Thesis title: A topology and heterogeneous resource aware scheduler for fractional GPU allocation in Kubernetes cluster
Advisor: Chou, Jerry (周志遠)
Committee members: Lee, Che-Rung (李哲榮); Lai, Kuan-Chou (賴冠州)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Institute of Information Systems and Applications
Year of publication: 2022
Academic year of graduation: 110 (2021–22)
Language: English
Pages: 40
Keywords (Chinese): GPU, scheduling, Kubernetes
Keywords (English): GPU, Scheduler, Kubernetes
GPUs provide enormous throughput through highly parallel computation, and are therefore adopted by many cloud platforms and application services. As technology has advanced, GPU performance has grown rapidly; data centers purchase GPUs in batches, so a single cluster contains GPUs with varying performance, and the position of a GPU within a node also directly affects the performance of some applications. Effectively managing GPU resources on a cluster has therefore become an important issue. The scheduler plays a key role here: it must consider both the characteristics of the resources and of the applications to use resources well. To overcome these issues, we designed and implemented KubeShare 2.0: using the Kubernetes scheduling framework, we built a scheduler for GPU workloads that is aware of GPU topology and heterogeneity and uses them as the basis for GPU allocation. We also validate the decisions of KubeShare 2.0 with real deep learning applications. KubeShare 2.0 provides better GPU allocation and supports sharing a single GPU among Pods. As the number of created Pods grows, our approach greatly reduces Pod creation time; compared with KubeShare 1.0, KubeShare 2.0 effectively reduces the average job completion time by up to 36%.


GPUs provide high throughput through highly parallel computing, so they are widely used by various cloud platforms and applications. Furthermore, with the rapid development of technology, the performance of GPUs has grown quickly. Data centers purchase GPUs in batches, so a cluster contains GPUs with different performance levels, and the placement of GPUs in different positions within a node also directly affects the performance of some applications. Effectively managing GPU resources on clusters has therefore become an important issue. The scheduler plays an important role here: it must consider both the application's characteristics and the available resources to make better use of them. To overcome these issues, we designed and implemented KubeShare 2.0. Using the Kubernetes scheduling framework, we implemented a scheduler for workloads that require GPUs, which perceives GPU topology and heterogeneity and uses them as the basis for GPU allocation. We also use real deep learning workloads to validate the decisions of KubeShare 2.0. KubeShare 2.0 provides better GPU allocation and supports sharing a single GPU between Pods. As the number of created Pods increases, our scheduler significantly reduces Pod creation time. Compared with KubeShare 1.0 [24], KubeShare 2.0 effectively reduces the average job completion time by up to 36%.
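The abstract describes the scheduling signals KubeShare 2.0 combines (GPU topology, hardware heterogeneity, and fractional sharing of a single GPU) without implementation detail. As a rough, hypothetical sketch of how such signals could be folded into a single placement score, the following Python fragment is illustrative only: the GPU fields, weights, and `score` function are assumptions for exposition, not KubeShare 2.0's actual algorithm (which is implemented as a Kubernetes scheduling-framework plugin in Go).

```python
from dataclasses import dataclass

@dataclass
class GPU:
    node: str
    index: int
    model: str        # GPU model, capturing cluster heterogeneity
    speed: float      # assumed relative throughput of the model (1.0 = baseline)
    link_group: int   # GPUs under the same PCIe switch / NUMA node (topology)
    free: float = 1.0 # unallocated fraction of the GPU (1.0 = fully free)

def score(gpu: GPU, peer_groups: set) -> float:
    """Higher is better. Three illustrative signals with made-up weights:
    packing  -- prefer partially used GPUs, limiting fragmentation of
                fractional shares;
    topology -- prefer a link group already hosting the job's peer Pods;
    speed    -- prefer faster models when everything else is equal."""
    packing = 1.0 - gpu.free
    topology = 1.0 if (gpu.node, gpu.link_group) in peer_groups else 0.0
    return 2.0 * topology + packing + 0.5 * gpu.speed

def pick_gpu(gpus, request, peer_groups):
    """Return the feasible GPU (enough free fraction) with the highest
    score, or None if the request cannot be placed."""
    feasible = [g for g in gpus if g.free >= request]
    if not feasible:
        return None
    return max(feasible, key=lambda g: score(g, peer_groups))

# Example cluster: two V100s on node n1 (different link groups, one
# partially used) and one slower K80 on node n2.
cluster = [
    GPU("n1", 0, "V100", 1.5, link_group=0, free=0.4),
    GPU("n1", 1, "V100", 1.5, link_group=1, free=1.0),
    GPU("n2", 0, "K80", 0.5, link_group=0, free=1.0),
]
# A 0.3-GPU request whose peers sit in n1's link group 0 lands on the
# partially used, topology-matching V100.
best = pick_gpu(cluster, 0.3, peer_groups={("n1", 0)})
```

A real scheduler would evaluate such a score inside the framework's Score extension point and reserve the chosen fraction at Bind time; the point of the sketch is only that packing, topology, and heterogeneity can be traded off in one ranking function.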

1 Introduction
2 Background
  2.1 Kubernetes
  2.2 Kubernetes Custom Scheduler
  2.3 Kubernetes Device Plugin
  2.4 Gemini
  2.5 KubeShare 1.0
3 KubeShare 2.0
  3.1 System Architecture
  3.2 Cluster Operation
  3.3 Schedule a Pod
  3.4 Fractional GPU allocation and binding
  3.5 Scheduling Decision
4 Evaluation
  4.1 Experimental Setup
  4.2 Topology Awareness
  4.3 Resource Heterogeneity
  4.4 Job Priority
  4.5 Fractional GPU allocation and Binding
5 Related Work
6 Conclusions and Future works
References

    [1] Kubernetes Device plugin. https://github.com/kubernetes/
    design-proposals-archive/blob/main/resource-management/
    device-plugin.md.
    [2] NVIDIA container runtime. https://github.com/NVIDIA/
    nvidia-container-runtime.
    [3] NVIDIA Device plugin. https://github.com/NVIDIA/k8s-device-plugin.
    [4] registry of kube-scheduler. https://github.com/kubernetes/kubernetes/
    blob/v1.18.10/pkg/scheduler/algorithmprovider/registry.go.
[5] Chen, H.-H., Lin, E.-T., Chou, Y.-M., and Chou, J. Gemini: Enabling multi-tenant GPU sharing based on kernel burst estimation. IEEE Transactions on Cloud Computing (2021).
[6] Gu, J., Song, S., Li, Y., and Luo, H. GaiaGPU: Sharing GPUs in container clouds.
    In 2018 IEEE Intl Conf on Parallel & Distributed Processing with Applications,
    Ubiquitous Computing & Communications, Big Data & Cloud Computing,
    Social Computing & Networking, Sustainable Computing & Communications
    (ISPA/IUCC/BDCloud/SocialCom/SustainCom) (2018), IEEE, pp. 469–
    476.
    [7] He, K., Zhang, X., Ren, S., and Sun, J. Deep residual learning for image
    recognition. In Proceedings of the IEEE conference on computer vision and
    pattern recognition (2016), pp. 770–778.
    [8] Hochreiter, S., and Schmidhuber, J. Long short-term memory. Neural computation
    9, 8 (1997), 1735–1780.
    [9] Le, Y., and Yang, X. Tiny imagenet visual recognition challenge. CS 231N 7,
    7 (2015), 3.
[10] Mahajan, K., Balasubramanian, A., Singhvi, A., Venkataraman, S., Akella, A., Phanishayee, A., and Chawla, S. Themis: Fair and efficient GPU cluster scheduling. In 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI 20) (2020), pp. 289–304.
    [11] Martin, D., Fowlkes, C., Tal, D., and Malik, J. A database of human segmented
    natural images and its application to evaluating segmentation algorithms and
    measuring ecological statistics. In Proc. 8th Int’l Conf. Computer Vision (July
    2001), vol. 2, pp. 416–423.
    [12] Merity, S., Xiong, C., Bradbury, J., and Socher, R. Pointer sentinel mixture
    models. arXiv preprint arXiv:1609.07843 (2016).
[13] Merkel, D., et al. Docker: lightweight Linux containers for consistent development and deployment. Linux Journal 239, 2 (2014), 2.
    [14] NVIDIA. NVIDIA GPU Specification. https://www.nvidia.com/zh-tw/
    geforce/graphics-cards/compare.
    [15] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen,
    T., Lin, Z., Gimelshein, N., Antiga, L., et al. Pytorch: An imperative style,
    high-performance deep learning library. Advances in neural information processing
    systems 32 (2019).
    [16] Peng, Y., Bao, Y., Chen, Y., Wu, C., and Guo, C. Optimus: an efficient dynamic
    resource scheduler for deep learning clusters. In Proceedings of the Thirteenth
    EuroSys Conference (2018), pp. 1–14.
    [17] Pytorch. TorchElastic. https://github.com/pytorch/elastic.
    [18] Shi, W., Caballero, J., Huszár, F., Totz, J., Aitken, A. P., Bishop, R., Rueckert,
    D., and Wang, Z. Real-time single image and video super-resolution using an
    efficient sub-pixel convolutional neural network. In Proceedings of the IEEE
    conference on computer vision and pattern recognition (2016), pp. 1874–1883.
[19] Song, S., Deng, L., Gong, J., and Luo, H. Gaia scheduler: A kubernetes-based scheduler framework. In 2018 IEEE Intl Conf on Parallel & Distributed
    Processing with Applications, Ubiquitous Computing & Communications, Big
    Data & Cloud Computing, Social Computing & Networking, Sustainable Computing
    & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom)
    (2018), IEEE, pp. 252–259.
    [20] Thinakaran, P., Gunasekaran, J. R., Sharma, B., Kandemir, M. T., and Das,
    C. R. Kube-knots: Resource harvesting through dynamic container orchestration
    in gpu-based datacenters. In 2019 IEEE International Conference on
    Cluster Computing (CLUSTER) (2019), IEEE, pp. 1–13.
    [21] VMware. The State of Kubernetes 2021. https://tanzu.vmware.com/
    content/ebooks/the-state-of-kubernetes-2021.
    [22] Xiao, W., Bhardwaj, R., Ramjee, R., Sivathanu, M., Kwatra, N., Han, Z., Patel,
    P., Peng, X., Zhao, H., Zhang, Q., et al. Gandiva: Introspective cluster
    scheduling for deep learning. In 13th USENIX Symposium on Operating Systems
    Design and Implementation (OSDI 18) (2018), pp. 595–610.
[23] Xiao, W., Ren, S., Li, Y., Zhang, Y., Hou, P., Li, Z., Feng, Y., Lin, W., and Jia, Y. AntMan: Dynamic scaling on GPU clusters for deep learning. In
    14th USENIX Symposium on Operating Systems Design and Implementation
    (OSDI 20) (2020), pp. 533–548.
    [24] Yeh, T.-A., Chen, H.-H., and Chou, J. Kubeshare: A framework to manage
    gpus as first-class and shared resources in container cloud. In Proceedings
    of the 29th International Symposium on High-Performance Parallel and Distributed
    Computing (2020), pp. 173–184.
[25] Zhao, H., Han, Z., Yang, Z., Zhang, Q., Yang, F., Zhou, L., Yang, M., Lau, F. C., Wang, Y., Xiong, Y., et al. HiveD: Sharing a GPU cluster for deep
    learning with guarantees. In 14th USENIX Symposium on Operating Systems
    Design and Implementation (OSDI 20) (2020), pp. 515–532.
    [26] Zhu, X., Gong, L., Zhu, Z., and Zhou, X. Vapor: A gpu sharing scheduler
    with communication and computation pipeline for distributed deep learning. In
    2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications,
    Big Data & Cloud Computing, Sustainable Computing & Communications,
    Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom)
    (2021), IEEE, pp. 108–116.
