
Author: 謝佳穎 (Hsieh, Chia-Ying)
Thesis Title: 支援物與雲間連續部署深度神經網路的多用戶系統
A Multi-Tenant System for Deploying Deep Neural Networks in a Thing-to-Cloud Continuum
Advisor: 徐正炘 (Hsu, Cheng-Hsin)
Committee Members: 黃俊龍 (Huang, Jiun-Long), 許健平 (Sheu, Jang-Ping), 楊舜仁 (Yang, Shun-Ren)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science
Year of Publication: 2022
Academic Year of Graduation: 111
Language: English
Number of Pages: 58
Chinese Keywords: 物聯網 (Internet of Things), 分散式深度學習 (distributed deep learning), 邊緣運算 (edge computing)
Foreign Keywords: Internet-of-Things, distributed deep learning, multi-task learning
  Abstract (translated from Chinese): IoT analytics applications based on Deep Neural Networks (DNNs) are becoming increasingly popular. Traditionally, computation-intensive DNN workloads were offloaded to cloud servers. However, as IoT devices proliferate, the amount of data they generate grows accordingly, and shipping all of it to the cloud increases latency due to network congestion. With the improving capabilities of IoT devices, researchers have therefore proposed partitioning DNNs and executing them across different devices. In this thesis, we propose a system that lets multiple tenants deploy DNNs in a thing-to-cloud continuum. To maximize the number of served requests, we employ four features: (i) multi-task, (ii) hitchhiking, (iii) early exit, and (iv) reconfiguration. We divide the DNN deployment decision process into a planning phase and an operation phase, and propose algorithms to solve the problem in each phase. In the planning phase, we derive a deployment plan from the current resource status; in the operation phase, we decide, based on the observed service quality and resource status, whether to reconfigure already-deployed models for better service quality. Finally, we implement a prototype testbed to evaluate the proposed system. The experimental results show that, in the planning phase, our system serves 6.8 times as many requests as alternative approaches; in the operation phase, it improves the satisfied ratio by 35%. We further observe that (i) multi-task and hitchhiking increase the number of served requests by 5.4 times, (ii) early exit reduces latency without violating accuracy requirements, and (iii) our system performs better under heavier workloads. We therefore recommend enabling multi-task, hitchhiking, and early exit in general, and enabling reconfiguration when the environment is highly dynamic.


    Deep Neural Network (DNN) based IoT analytics is becoming popular. With the growing amount of IoT sensor data, offloading all computation to the cloud becomes inefficient due to traffic congestion. With the improved capabilities of IoT devices, dividing DNNs among IoT devices, edge servers, and cloud servers has been proposed. In this thesis, we propose a multi-tenant system, called T2C, to dynamically choose, deploy, monitor, and control IoT analytics implemented via DNNs in a thing-to-cloud continuum. T2C leverages (i) multi-task, (ii) hitchhiking, (iii) early exit, and (iv) reconfiguration to maximize the number of served user requests while satisfying their accuracy and latency requirements. We divide the deployment decision-making process into a planning phase and an operation phase. In the planning phase, we make the deployment plan under the current resource status. In the operation phase, we check each model's runtime Quality-of-Service (QoS) and decide whether to reconfigure for better performance. We propose a suite of deployment planning and dynamic reconfiguration algorithms to dynamically deploy and migrate layers among IoT devices, edge servers, and cloud servers. We implement the proposed system in a prototype testbed. The results show that our system: (i) achieves a 6.8X throughput boost compared to baseline algorithms in the planning phase and (ii) improves the satisfied ratio by up to 35% in the operation phase. Furthermore, we observe that (i) multi-task and hitchhiking improve throughput by up to 5.4X, (ii) early exits reduce latency without violating accuracy requirements, and (iii) T2C delivers a larger performance boost under heavier workloads. Hence, we suggest running T2C with multi-task, hitchhiking, and early exits in normal cases and enabling reconfiguration when the environment is highly dynamic.
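The early-exit feature mentioned in the abstract can be illustrated with a minimal sketch. The stage and exit-head functions below are hypothetical stand-ins, not the thesis's T2C implementation: each stage of a partitioned DNN is followed by a cheap exit head that returns a prediction and a confidence, and inference stops at the first exit whose confidence clears a threshold, so deeper (edge- or cloud-side) stages run only when earlier exits are unsure.

```python
def run_with_early_exit(stages, x, threshold=0.9):
    """stages: list of (transform, exit_head) pairs, ordered thing -> cloud.

    Each transform forwards features through one partition of the DNN;
    each exit_head is a lightweight classifier on those features that
    returns (prediction, confidence). Returns (prediction, exit_depth).
    """
    for depth, (transform, exit_head) in enumerate(stages, start=1):
        x = transform(x)                 # forward through this partition
        pred, confidence = exit_head(x)  # cheap classification attempt
        if confidence >= threshold:      # confident enough: exit early
            return pred, depth
    return pred, depth                   # fall through to the final exit

# Toy stages: later exits are more confident but cost more latency.
stages = [
    (lambda x: x * 2, lambda x: ("cat", 0.60)),  # thing-side exit
    (lambda x: x * 2, lambda x: ("cat", 0.95)),  # edge-side exit
    (lambda x: x * 2, lambda x: ("cat", 0.99)),  # cloud-side exit
]
print(run_with_early_exit(stages, 1))  # → ('cat', 2): stops at the edge
```

Raising the threshold pushes more inputs to deeper exits, trading latency for accuracy; that is the kind of knob a system like T2C can tune against each request's accuracy and latency requirements.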

    Abstract (Chinese)
    Abstract
    Acknowledgments (Chinese)
    Acknowledgments
    1 Introduction
      1.1 Contributions
      1.2 Limitations
      1.3 Thesis Organization
    2 Background
      2.1 Internet-of-Things
      2.2 Deep Neural Networks
      2.3 Cloud Computing
      2.4 Edge Computing
      2.5 Thing-to-Cloud Continuum
    3 Related Work
      3.1 Multi-task Networks
      3.2 Early Exit
      3.3 Dynamic Reconfiguration
      3.4 DNN Deployment
      3.5 IoT Analytics Deployment
    4 System Overview
      4.1 Components
      4.2 Workflow
    5 Planning Phase: Deployment Planning
      5.1 Notations
      5.2 System Models
      5.3 Formulation
      5.4 Our Proposed Algorithm
    6 Operation Phase: Dynamic Reconfiguration
      6.1 Problem
      6.2 Our Proposed Algorithm
    7 Implementations
      7.1 Testbed
      7.2 Kubernetes and Model Deployment
      7.3 Multi-task Models
    8 Evaluations
      8.1 Setup
      8.2 Planning Phase Results
      8.3 Operation Phase Results
      8.4 Implications of System Parameters
      8.5 Discussion and Recommendations
    9 Conclusion
      9.1 Future Work
    Bibliography

