ALBERT: an Automatic Learning Based Execution and Resource Management System for Hadoop


Graduate student:	陳振群 Chen, Chen Chun
Thesis title:	ALBERT: an Automatic Learning Based Execution and Resource Management System for Hadoop
Advisor:	周志遠 Chou, Jerry Chi-Yuan
Oral defense committee:	金仲達 King, Chung-Ta; 李哲榮 Lee, Che-Rung
Degree:	Master
Department:
Year of publication:	2018
Academic year of graduation:	106
Language:	English
Number of pages:	36
Keywords:	Data Analytics, Deep Learning, Time Prediction, Optimization, Job Scheduling

Hadoop is a popular computing framework for timely and cost-effective data processing on large clusters of commodity machines. It relieves programmers of the burden of distributed programming, and an ecosystem of Big Data solutions has developed around it. However, a Hadoop job's execution time depends heavily on its runtime configuration and resource selection. With more than 100 job configuration settings in Hadoop and diverse resource instance options in cloud or virtualized computing environments, running Hadoop jobs still requires substantial expertise and experience. To address this challenge, we applied a deep neural network to predict Hadoop job time from historical execution data, and we proposed optimization methods to reduce job execution time and cost. The results showed that our prediction method achieved almost 90% time prediction accuracy and clearly outperformed three other state-of-the-art regression-based prediction methods. Based on the time prediction, our proposed configuration search method and job scheduling algorithm shortened the execution time of a single Hadoop job by more than 2 times and reduced the execution cost of processing a batch of Hadoop jobs by more than 2.7 times, without relying on any human knowledge or intervention.
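
The idea of searching for a configuration with the help of a time predictor can be illustrated with a minimal sketch. It is not the thesis's method: the abstract does not describe the search algorithm, so plain exhaustive enumeration is used here, and the predictor is a made-up cost formula standing in for the trained neural network. The parameter names (reducers, sort_mb), the value ranges, and all function names are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Iterator, List, Tuple

Config = Dict[str, int]


def candidate_configs(space: Dict[str, List[int]]) -> Iterator[Config]:
    """Enumerate every combination of the listed parameter values."""
    names = sorted(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def best_config(
    space: Dict[str, List[int]],
    predict_time: Callable[[Config], float],
) -> Tuple[Config, float]:
    """Return the candidate with the smallest predicted run time."""
    best = min(candidate_configs(space), key=predict_time)
    return best, predict_time(best)


def toy_predict_time(cfg: Config) -> float:
    """Stand-in predictor (an assumption, not a learned model).

    More reducers split the work but add per-reducer overhead;
    a larger sort buffer lowers a fixed spill penalty.
    """
    serial_work = 1200.0  # seconds, invented for the example
    return (
        serial_work / cfg["reducers"]
        + 5.0 * cfg["reducers"]
        + 2000.0 / cfg["sort_mb"]
    )


# Hypothetical search space: 6 x 3 = 18 candidate configurations.
space = {"reducers": [1, 2, 4, 8, 16, 32], "sort_mb": [100, 200, 400]}
config, seconds = best_config(space, toy_predict_time)
print(config, seconds)
```

Because the search only calls the predictor, no Hadoop job has to run while candidates are compared; with a space much larger than this one, exhaustive enumeration would have to give way to a sampling or heuristic search.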

Introduction

Design and Implementation of ALBERT
1 On-demand Hadoop Execution Environment
2 Data Driven and Self-Learning Approach
3 Implementation on Cloud Platform

Deep Learning Job Time Prediction Method
1 Profiler
2 Classifier
3 Predictor

Job and System Optimization Methods
1 Job Level Optimization
2 System Level Optimization

Experimental Setup
1 Environment & Setup
2 State-of-the-art Comparison

Experimental Evaluation
1 Job Classification
2 Time Prediction
3 Job & System Optimization
3.1 Job Level Optimization
3.2 System Level Optimization

Related Work
Conclusion
References

