
Student: Chu, Yen-Chu (朱彥竹)
Thesis title: A topology and heterogeneous resource aware scheduler for fractional GPU allocation in Kubernetes cluster
Advisor: Chou, Jerry (周志遠)
Committee members: Lee, Che-Rung (李哲榮); Lai, Kuan-Chou (賴冠州)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Institute of Information Systems and Applications
Year of publication: 2022
Academic year of graduation: 110 (2021–22)
Language: English
Pages: 40
Keywords (Chinese): GPU, scheduling, Kubernetes
Keywords (English): GPU, Scheduler, Kubernetes
GPUs provide enormous throughput through highly parallel computation, and are therefore adopted by many cloud platforms and application services. As technology has advanced, GPU performance has grown rapidly; data centers purchase GPUs in batches, so a single cluster contains GPUs with varying performance, and the position of a GPU within a node also directly affects the performance of some applications. Effectively managing GPU resources on a cluster has therefore become an important issue. The scheduler plays a key role here: it must consider both the characteristics of the resources and of the applications to use resources well. To overcome these issues, we designed and implemented KubeShare 2.0: using the Kubernetes scheduling framework, we built a scheduler for GPU workloads that is aware of GPU topology and heterogeneity and uses them as the basis for GPU allocation. We also validate the decisions of KubeShare 2.0 with real deep learning applications. KubeShare 2.0 provides better GPU allocation and supports sharing a single GPU among Pods. As the number of created Pods grows, our approach greatly reduces Pod creation time; compared with KubeShare 1.0, KubeShare 2.0 effectively reduces the average job completion time by up to 36%.


GPUs provide high throughput through highly parallel computing, so they are widely used by various cloud platforms and applications. Furthermore, with the rapid development of technology, the performance of GPUs has grown quickly. Data centers purchase GPUs in batches, so a cluster contains GPUs with different performance levels, and the placement of GPUs in different positions within a node also directly affects the performance of some applications. Effectively managing GPU resources on clusters has therefore become an important issue. The scheduler plays an important role here: it must consider both the application's characteristics and the available resources to make better use of them. To overcome these issues, we designed and implemented KubeShare 2.0. Using the Kubernetes scheduling framework, we implemented a scheduler for workloads that require GPUs, which perceives GPU topology and heterogeneity and uses them as the basis for GPU allocation. We also use real deep learning workloads to validate the decisions of KubeShare 2.0. KubeShare 2.0 provides better GPU allocation and supports sharing a single GPU between Pods. As the number of created Pods increases, our scheduler significantly reduces Pod creation time. Compared with KubeShare 1.0 [24], KubeShare 2.0 effectively reduces the average job completion time by up to 36%.
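The abstract describes the scheduling signals KubeShare 2.0 combines (GPU topology, hardware heterogeneity, and fractional sharing of a single GPU) without implementation detail. As a rough, hypothetical sketch of how such signals could be folded into a single placement score, the following Python fragment is illustrative only: the GPU fields, weights, and `score` function are assumptions for exposition, not KubeShare 2.0's actual algorithm (which is implemented as a Kubernetes scheduling-framework plugin in Go).

```python
from dataclasses import dataclass

@dataclass
class GPU:
    node: str
    index: int
    model: str        # GPU model, capturing cluster heterogeneity
    speed: float      # assumed relative throughput of the model (1.0 = baseline)
    link_group: int   # GPUs under the same PCIe switch / NUMA node (topology)
    free: float = 1.0 # unallocated fraction of the GPU (1.0 = fully free)

def score(gpu: GPU, peer_groups: set) -> float:
    """Higher is better. Three illustrative signals with made-up weights:
    packing  -- prefer partially used GPUs, limiting fragmentation of
                fractional shares;
    topology -- prefer a link group already hosting the job's peer Pods;
    speed    -- prefer faster models when everything else is equal."""
    packing = 1.0 - gpu.free
    topology = 1.0 if (gpu.node, gpu.link_group) in peer_groups else 0.0
    return 2.0 * topology + packing + 0.5 * gpu.speed

def pick_gpu(gpus, request, peer_groups):
    """Return the feasible GPU (enough free fraction) with the highest
    score, or None if the request cannot be placed."""
    feasible = [g for g in gpus if g.free >= request]
    if not feasible:
        return None
    return max(feasible, key=lambda g: score(g, peer_groups))

# Example cluster: two V100s on node n1 (different link groups, one
# partially used) and one slower K80 on node n2.
cluster = [
    GPU("n1", 0, "V100", 1.5, link_group=0, free=0.4),
    GPU("n1", 1, "V100", 1.5, link_group=1, free=1.0),
    GPU("n2", 0, "K80", 0.5, link_group=0, free=1.0),
]
# A 0.3-GPU request whose peers sit in n1's link group 0 lands on the
# partially used, topology-matching V100.
best = pick_gpu(cluster, 0.3, peer_groups={("n1", 0)})
```

A real scheduler would evaluate such a score inside the framework's Score extension point and reserve the chosen fraction at Bind time; the point of the sketch is only that packing, topology, and heterogeneity can be traded off in one ranking function.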

1 Introduction
2 Background
  2.1 Kubernetes
  2.2 Kubernetes Custom Scheduler
  2.3 Kubernetes Device Plugin
  2.4 Gemini
  2.5 KubeShare 1.0
3 KubeShare 2.0
  3.1 System Architecture
  3.2 Cluster Operation
  3.3 Schedule a Pod
  3.4 Fractional GPU allocation and binding
  3.5 Scheduling Decision
4 Evaluation
  4.1 Experimental Setup
  4.2 Topology Awareness
  4.3 Resource Heterogeneity
  4.4 Job Priority
  4.5 Fractional GPU allocation and Binding
5 Related Work
6 Conclusions and Future works
References

    [1] Kubernetes Device plugin. https://github.com/kubernetes/
    design-proposals-archive/blob/main/resource-management/
    device-plugin.md.
    [2] NVIDIA container runtime. https://github.com/NVIDIA/
    nvidia-container-runtime.
    [3] NVIDIA Device plugin. https://github.com/NVIDIA/k8s-device-plugin.
    [4] registry of kube-scheduler. https://github.com/kubernetes/kubernetes/
    blob/v1.18.10/pkg/scheduler/algorithmprovider/registry.go.
[5] Chen, H.-H., Lin, E.-T., Chou, Y.-M., and Chou, J. Gemini: Enabling multi-tenant GPU sharing based on kernel burst estimation. IEEE Transactions on Cloud Computing (2021).
[6] Gu, J., Song, S., Li, Y., and Luo, H. GaiaGPU: Sharing GPUs in container clouds.
    In 2018 IEEE Intl Conf on Parallel & Distributed Processing with Applications,
    Ubiquitous Computing & Communications, Big Data & Cloud Computing,
    Social Computing & Networking, Sustainable Computing & Communications
    (ISPA/IUCC/BDCloud/SocialCom/SustainCom) (2018), IEEE, pp. 469–
    476.
    [7] He, K., Zhang, X., Ren, S., and Sun, J. Deep residual learning for image
    recognition. In Proceedings of the IEEE conference on computer vision and
    pattern recognition (2016), pp. 770–778.
    [8] Hochreiter, S., and Schmidhuber, J. Long short-term memory. Neural computation
    9, 8 (1997), 1735–1780.
    [9] Le, Y., and Yang, X. Tiny imagenet visual recognition challenge. CS 231N 7,
    7 (2015), 3.
[10] Mahajan, K., Balasubramanian, A., Singhvi, A., Venkataraman, S., Akella, A., Phanishayee, A., and Chawla, S. Themis: Fair and efficient GPU cluster scheduling. In 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI 20) (2020), pp. 289–304.
    [11] Martin, D., Fowlkes, C., Tal, D., and Malik, J. A database of human segmented
    natural images and its application to evaluating segmentation algorithms and
    measuring ecological statistics. In Proc. 8th Int’l Conf. Computer Vision (July
    2001), vol. 2, pp. 416–423.
    [12] Merity, S., Xiong, C., Bradbury, J., and Socher, R. Pointer sentinel mixture
    models. arXiv preprint arXiv:1609.07843 (2016).
[13] Merkel, D., et al. Docker: lightweight Linux containers for consistent development and deployment. Linux Journal 239, 2 (2014), 2.
    [14] NVIDIA. NVIDIA GPU Specification. https://www.nvidia.com/zh-tw/
    geforce/graphics-cards/compare.
    [15] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen,
    T., Lin, Z., Gimelshein, N., Antiga, L., et al. Pytorch: An imperative style,
    high-performance deep learning library. Advances in neural information processing
    systems 32 (2019).
    [16] Peng, Y., Bao, Y., Chen, Y., Wu, C., and Guo, C. Optimus: an efficient dynamic
    resource scheduler for deep learning clusters. In Proceedings of the Thirteenth
    EuroSys Conference (2018), pp. 1–14.
    [17] Pytorch. TorchElastic. https://github.com/pytorch/elastic.
    [18] Shi, W., Caballero, J., Huszár, F., Totz, J., Aitken, A. P., Bishop, R., Rueckert,
    D., and Wang, Z. Real-time single image and video super-resolution using an
    efficient sub-pixel convolutional neural network. In Proceedings of the IEEE
    conference on computer vision and pattern recognition (2016), pp. 1874–1883.
[19] Song, S., Deng, L., Gong, J., and Luo, H. Gaia scheduler: A kubernetes-based scheduler framework. In 2018 IEEE Intl Conf on Parallel & Distributed
    Processing with Applications, Ubiquitous Computing & Communications, Big
    Data & Cloud Computing, Social Computing & Networking, Sustainable Computing
    & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom)
    (2018), IEEE, pp. 252–259.
    [20] Thinakaran, P., Gunasekaran, J. R., Sharma, B., Kandemir, M. T., and Das,
    C. R. Kube-knots: Resource harvesting through dynamic container orchestration
    in gpu-based datacenters. In 2019 IEEE International Conference on
    Cluster Computing (CLUSTER) (2019), IEEE, pp. 1–13.
    [21] VMware. The State of Kubernetes 2021. https://tanzu.vmware.com/
    content/ebooks/the-state-of-kubernetes-2021.
    [22] Xiao, W., Bhardwaj, R., Ramjee, R., Sivathanu, M., Kwatra, N., Han, Z., Patel,
    P., Peng, X., Zhao, H., Zhang, Q., et al. Gandiva: Introspective cluster
    scheduling for deep learning. In 13th USENIX Symposium on Operating Systems
    Design and Implementation (OSDI 18) (2018), pp. 595–610.
[23] Xiao, W., Ren, S., Li, Y., Zhang, Y., Hou, P., Li, Z., Feng, Y., Lin, W., and Jia, Y. AntMan: Dynamic scaling on GPU clusters for deep learning. In
    14th USENIX Symposium on Operating Systems Design and Implementation
    (OSDI 20) (2020), pp. 533–548.
    [24] Yeh, T.-A., Chen, H.-H., and Chou, J. Kubeshare: A framework to manage
    gpus as first-class and shared resources in container cloud. In Proceedings
    of the 29th International Symposium on High-Performance Parallel and Distributed
    Computing (2020), pp. 173–184.
[25] Zhao, H., Han, Z., Yang, Z., Zhang, Q., Yang, F., Zhou, L., Yang, M., Lau, F. C., Wang, Y., Xiong, Y., et al. HiveD: Sharing a GPU cluster for deep
    learning with guarantees. In 14th USENIX Symposium on Operating Systems
    Design and Implementation (OSDI 20) (2020), pp. 515–532.
    [26] Zhu, X., Gong, L., Zhu, Z., and Zhou, X. Vapor: A gpu sharing scheduler
    with communication and computation pipeline for distributed deep learning. In
    2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications,
    Big Data & Cloud Computing, Sustainable Computing & Communications,
    Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom)
    (2021), IEEE, pp. 108–116.
