
Student: Chang, Jen-I (張任毅)
Thesis title: Ultra-Low-Latency Distributed Deep Neural Network over Hierarchical Mobile Networks (多階層行動網路之超低延遲分散式深度神經網路)
Advisors: Chen, Wen-Tsuen (陳文村); Sheu, Jang-Ping (許健平)
Committee members: Tseng, Yu-Chee (曾煜棋); Kuo, Jian-Jhih (郭建志)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science
Year of publication: 2019
Graduation academic year: 107
Language: English
Pages: 33
Chinese keywords: 多階層行動網路, 深度神經網路, 模型分割, 模型部署, 早期推斷
English keywords: Hierarchical mobile network, Deep neural network, Model partition, Model deployment, Early inference
  • In recent years, with the emergence of Fog Computing and Mobile Edge Computing, hierarchical mobile network architectures have drawn increasing attention. Prior studies have shown that distributed deployment techniques based on Deep Neural Network model partition in such environments can effectively reduce a DNN's response time, and that combining them with the early-inference technique can improve responsiveness further. However, a careless model partition or misuse of early inference may increase rather than decrease the response time. Moreover, most existing work concentrates on classifier design for early inference and has not yet explored where best to deploy the classifiers, or how to partition and deploy the neural network model.
    In this thesis, we formulate an optimization problem, DEMAND-OPE, that jointly considers a DNN's response time and its throughput. To explore the basic properties of the problem, we first design an algorithm, COLT, for a simplified version of DEMAND-OPE without early inference (named DEMAND). Then, to trade off computing time against data transfer time, we extend the idea behind COLT into another algorithm, COLT-OPE, which additionally considers the deployment of early-inference classifiers in the hierarchical mobile network to achieve shorter response times. Finally, this thesis generalizes DEMAND-OPE to DEMAND-OPME, whose setting allows more flexible deployment of early-inference classifiers, further reducing data transfer time and increasing the probability of early inference. Simulation results show that our algorithm (COLT-OPE) outperforms conventional baselines (e.g., deploying everything on the end devices or entirely in the cloud) by roughly 200%.
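The trade-off between on-device computing time and data transfer time that DEMAND explores can be illustrated with a minimal partition-point search over a two-tier device/cloud split (a sketch only; the layer timings, activation sizes, and bandwidth below are hypothetical, not values from the thesis):

```python
def best_partition(device_ms, cloud_ms, sizes_kb, bw_kbps):
    """Pick the layer after which to hand off to the cloud.

    Layers [0, k) run on the device, layers [k, n) in the cloud;
    sizes_kb[k] is the data that must cross the uplink for split k
    (sizes_kb[0] is the raw input, sizes_kb[n] the final prediction).
    Returns (split_index, response_time_ms).
    """
    n = len(device_ms)
    best_k, best_t = 0, float("inf")
    for k in range(n + 1):
        t = (sum(device_ms[:k])                 # device-side computing
             + sizes_kb[k] / bw_kbps * 1000.0   # uplink transfer (ms)
             + sum(cloud_ms[k:]))               # cloud-side computing
        if t < best_t:
            best_k, best_t = k, t
    return best_k, best_t

# A slow uplink favors computing more layers locally, since early
# activations are large; a fast uplink favors offloading everything.
print(best_partition([5, 5, 5], [1, 1, 1], [100, 50, 10, 1], 1000))
# → (3, 16.0): with a 1 Mbps-class link, keep all layers on the device
```

The same exhaustive scan generalizes to the multi-level hierarchy considered in the thesis, where each layer may land on any tier rather than just "device" or "cloud".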


    Traditionally, an input datum (e.g., an image) must travel through the entire Deep Neural Network (DNN) model (e.g., AlexNet) to make an inference. Recently, the notion of partitioning the model over multi-level computing units between the user devices and the cloud, combined with the early-inference technique, has been proposed to shorten the response time. The computing units form a hierarchical mobile network to provide locality-aware computation, and the early-inference technique allows prediction results to exit the model early with some probability. However, an inadequate model partition and misapplied early inference may prolong the response time. Previous studies focus on the classifier design for early inference, and thus the optimal model partition with classifier deployment has not been explored. To the best of our knowledge, this thesis makes the first attempt to accelerate the response time of DNN deployment on a multi-level mobile network. This thesis studies DEMAND-OPE, a problem that considers both response time and throughput. Due to the intractability of DEMAND-OPE, we first design COLT for the simplified DEMAND-OPE without Optional Exit Points (DEMAND) to examine the trade-off between computing time and data transfer time. Afterward, an extension termed COLT-OPE is developed to achieve lower response latency. Finally, we generalize DEMAND-OPE to DEMAND-OPME to further reduce data transfer time and increase the probability of early inference. Simulation results show that our algorithms (COLT-OPE) outperform other methods by 200%.
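The early-inference mechanism described above, in which a prediction may leave the model at an exit point once a side classifier is confident enough, can be outlined as follows (an illustrative sketch in the spirit of BranchyNet-style early exits; the entropy-threshold criterion and the toy layers are assumptions, not the thesis's implementation):

```python
import math

def entropy(probs):
    """Shannon entropy of a softmax output; low entropy = confident."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def early_exit_inference(x, layers, exits, threshold):
    """Run `layers` in order; after layer i, if an exit classifier is
    attached (i in `exits`) and its prediction is confident enough
    (entropy < threshold), return (probs, exit_layer) early.
    The exit at the last layer is mandatory and always taken."""
    last = len(layers) - 1
    for i, layer in enumerate(layers):
        x = layer(x)
        if i in exits:
            probs = exits[i](x)
            if i == last or entropy(probs) < threshold:
                return probs, i
    raise ValueError("the last layer must have an exit classifier")

# Toy model: two 'layers' and two exit classifiers with fixed outputs.
layers = [lambda v: v + 1, lambda v: v + 1]
exits = {0: lambda v: [0.5, 0.5],    # unsure  -> keep going
         1: lambda v: [0.99, 0.01]}  # confident final classifier
```

With a strict threshold (e.g. 0.3 nats) the uncertain first exit is skipped and inference runs to the final layer; with a loose threshold the input leaves at the first exit, saving the remaining layers' computation and any uplink transfer they would require.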

    1. Introduction
    2. Related Work
       2.1 Fog Computing and Edge Computing
       2.2 Distributed Deep Neural Network
       2.3 Deep Neural Network Extensions
       2.4 Early-Inference Technique
    3. Model Partition with Classifier Problem
       3.1 Problem Formulation
       3.2 Example
       3.3 Summary
    4. Completion Time Aware Algorithm
       4.1 Completion-Time-Aware Layer Deployment Algorithm
       4.2 COLT with Optional Exit Points
       4.3 COLT with Optional Multiple Exit Points
       4.4 Summary
    5. Performance Evaluation
       5.1 Completion Time
       5.2 The Efficiency of Exit Point Deployment
       5.3 The Effectiveness of Random Sampling Technique
       5.4 Latency of Computing and Data Transfer
       5.5 The Effectiveness of Multiple Exit Points
    6. Conclusions
    7. Bibliography

