Author: 洪嘉廷
Hung, Chia-Ting
Thesis title: 針對平行資料分析計算的位置感知分散式快取方法
A Locality-aware Cooperative Distributed Memory Caching for Parallel Data Analytic Applications
Advisor: 周志遠
Chou, Jerry
Committee members: 李哲榮
Lee, Che-Rung
賴冠州
Lai, Kuan-Chou
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2022
Academic year of graduation: 110
Language: English
Number of pages: 29
Chinese keywords: 平行數據處理, 快取, 分散式系統
English keywords: Parallel Data Processing, Caching, Distributed Systems
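  • Memory caching has long been used to bridge the performance gap between processors and storage devices, reducing data access time for data-intensive computations.
    Previous research on memory caching has mainly focused on maximizing the cache hit rate on a single machine. In this work, however, we argue that for parallel data analytic applications a distributed caching system should operate cooperatively. These applications are widely used by emerging technologies such as Big Data and AI (Artificial Intelligence) to perform data mining and complex analysis on larger volumes of data in less time.
    A parallel data analytic job consists of multiple parallel subtasks, so its completion time is bounded by its slowest subtask. This means the job cannot benefit from caching until all inputs of all its subtasks are cached.
    To address this problem, we propose a cooperative memory caching system design. The system periodically rearranges the cache placement among nodes according to data access patterns, while taking task dependencies and network locality into account.
    We evaluate our approach with an event-driven simulator, using one synthetic workload and one real-world workload.
    The results show that our approach reduces average job completion time by up to 41% compared with non-cooperative caching systems, and by up to 35% compared with other cooperative caching systems.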


    Memory caching has long been used to bridge the performance gap between processor and disk, reducing the data access time of data-intensive computations. Previous studies on caching mostly focus on optimizing the hit rate of a single machine. In this paper, we argue that the caching decisions of a distributed memory system should instead be made cooperatively for parallel data analytic applications, which are commonly used by emerging technologies such as Big Data and AI (Artificial Intelligence) to perform data mining and sophisticated analytics on larger data volumes in a shorter time.
    A parallel data analytic job consists of multiple parallel tasks. Hence, the completion time of a job is bounded by its slowest task, meaning that the job cannot benefit from caching until the inputs of all its tasks are cached. To address this problem, we propose a cooperative caching design that periodically rearranges the cache placement among nodes according to the data access pattern, while taking task dependency and network locality into account. Our approach is evaluated with a trace-driven simulator using both a synthetic workload and real-world traces. The results show that it reduces average job completion time by up to 33% compared to non-collaborative caching policies and by up to 25% compared to other state-of-the-art collaborative caching policies.
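The all-or-nothing effect described in the abstract can be illustrated with a minimal sketch. It is not the thesis's simulator: the function name `job_completion_time`, the one-input-block-per-task model, and the access costs (1.0 for a cached block, 10.0 for a disk read) are assumptions chosen purely for illustration.

```python
def job_completion_time(task_inputs, cached, cache_time=1.0, disk_time=10.0):
    """Completion time of a job whose tasks run in parallel.

    Hypothetical model: each task reads exactly one input block, and the
    job finishes only when its slowest task finishes.
    """
    return max(cache_time if block in cached else disk_time
               for block in task_inputs)


inputs = ["a", "b", "c", "d"]  # one input block per parallel task

none_cached = job_completion_time(inputs, set())
partly_cached = job_completion_time(inputs, {"a", "b", "c"})
all_cached = job_completion_time(inputs, set(inputs))

print(none_cached, partly_cached, all_cached)  # 10.0 10.0 1.0
```

Under these assumed costs, caching three of the four inputs leaves the job's completion time unchanged, because the one uncached task still dominates. Only when every input is cached does the job speed up, which is why the placement decision has to consider a job's inputs together rather than block by block.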

    Contents
    1 Introduction
    2 Background
      2.1 Cooperative Memory Caching System
        2.1.1 Problem Statement
        2.1.2 Cooperative Caching System Architecture
        2.1.3 Offline Placement Management
        2.1.4 Online Management
        2.1.5 Hybrid Placement Management
    3 Setups
      3.1 Workload
      3.2 Compared Methods
    4 Experimental Results
      4.1 Online Management Evaluation
      4.2 Remote Access
      4.3 Global Eviction
      4.4 Replication
      4.5 Offline Placement
    5 Related Work
    6 Conclusions
    References
