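Graduate student: | 伍銘基 |
---|---|
Thesis title: | 資源控管與進度平衡之MapReduce排程機制 Resource-Aware and Mismatch Controlling MapReduce Scheduler |
Advisor: | 周志遠 |
Committee members: | 李哲榮, 蕭宏章 |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Year of publication: | 2014 |
Academic year of graduation: | 102 |
Language: | English |
Number of pages: | 35 |
Chinese keywords: | 資源 (resource), 不平衡 (imbalance) |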
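As the volume of data on the internet grows day by day, methods capable of handling problems at a new order of magnitude become ever more important. In today's big data era, Hadoop MapReduce is one of the important tools widely used to process large, fast-growing data. Many studies have proposed strategies on top of the traditional MapReduce framework to adapt to different situations; however, to truly use all the resources in a cluster, we still need a new framework for allocating work. In this thesis, we develop a resource control strategy to overcome the shortcomings of the traditional framework, and propose a progress-balancing algorithm that regulates the progress imbalance between mappers and reducers, thereby achieving high resource utilization within the cluster.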
As more and more data is generated every moment over the internet, methods for solving problems at this new scale are becoming increasingly important. Hadoop MapReduce is one of the most important tools in today's big data era, widely used to solve large-scale and rapidly growing problems. Building on the traditional MapReduce framework, many studies have proposed strategies to adapt to different situations. But to make full use of the resources of a Hadoop cluster, we still require a new framework to allocate tasks. In this thesis, we develop a resource-aware scheduling strategy to overcome the drawbacks of the traditional framework, and propose a mismatch-controlling algorithm that coordinates the progress of mappers and reducers to achieve full resource usage.