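Graduate student: 袁帥
Thesis title: Accelerate Reed-Solomon Codes on GPUs
Thesis title (translated from Chinese): Applying GPGPU to Accelerate the Encoding and Decoding of Reed-Solomon Erasure Codes
Advisor: 周志遠
Oral defense committee: 李哲榮, 林俊淵
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2014
Academic year of graduation: 102
Language: English
Number of pages: 47
Keywords (translated from Chinese): Reed-Solomon codes, erasure codes, general-purpose GPU (GPGPU)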
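  • Reed-Solomon codes are a redundancy solution widely used in cloud storage systems. Compared with replication, the traditional redundancy solution, they preserve the system's fault tolerance while substantially reducing storage overhead. However, Reed-Solomon encoding and decoding are computationally complex and consume a large amount of computing time. In this thesis, we use the GPU as an accelerator and explore several techniques for accelerating Reed-Solomon encoding and decoding on it. We also implement a GPU version of Reed-Solomon codes in CUDA and evaluate its performance. For comparison, we measure on an Intel Xeon CPU the performance of Jerasure, the best CPU implementation we are aware of. Our optimized GPU version ultimately achieves a speedup of more than 14x.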


    1 Introduction
    2 Related Works
    3 Background
      3.1 Reed-Solomon Coding Mechanism
      3.2 Brief Introduction of Galois Field
    4 Accelerating Operations over Galois Field
      4.1 GPU Implementation: Loop-based or Table-based?
        4.1.1 Overview of the Loop-based Method
        4.1.2 Overview of the Table-based Methods
        4.1.3 Comparison between the Loop-based and the Log&exp Table-based Method
      4.2 Further Improvement of the Log&exp Table-based Method
    5 Accelerating Matrix Multiplication
      5.1 Square-Tiling Algorithm
      5.2 Generalized Tiling Algorithm
      5.3 Further Improvement of Tiling Algorithm
    6 Accelerating Decoding Matrix Generation
    7 Reducing Data Transfer Overhead
      7.1 Using Pinned Host Memory
      7.2 Using CUDA Streaming
    8 Experiment
      8.1 Experiment Setup
      8.2 Evaluation Metrics
      8.3 Overall Performance Evaluation
        8.3.1 Step-by-step Improvement
        8.3.2 GPU vs. CPU
      8.4 Accelerating Operations over Galois Field
        8.4.1 GPU Implementation: Loop-based or Table-based?
        8.4.2 Further Improvement of the Log&exp Table-based Method
      8.5 Accelerating Matrix Multiplication
      8.6 Reducing Data Transfer Overhead
        8.6.1 Using Pinned Host Memory
        8.6.2 Using CUDA Streaming
    9 Conclusion
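
The outline contrasts a loop-based method with a log&exp table-based method for Galois field multiplication. As a rough illustration of that distinction only, the Python sketch below implements both over GF(2^8). The field width, the primitive polynomial 0x11D, the generator 2, and every function and variable name are assumptions made for this example; the record does not state which parameters or code the thesis uses, and the thesis's own implementation is in CUDA.

```python
# Illustrative sketch, not the thesis's code.
# Assumptions: field GF(2^8), primitive polynomial x^8 + x^4 + x^3 + x^2 + 1
# (0x11D), generator element 2. All names here are hypothetical.
PRIM_POLY = 0x11D
FIELD_SIZE = 256
ORDER = FIELD_SIZE - 1  # number of non-zero field elements


def gf_mul_loop(a, b):
    """Loop-based multiply: shift-and-add (Russian peasant style) with
    reduction by the primitive polynomial whenever a overflows 8 bits."""
    product = 0
    while b:
        if b & 1:
            product ^= a      # addition in GF(2^w) is XOR
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY    # reduce back into 8 bits
        b >>= 1
    return product


def build_tables():
    """Build log and exp tables from successive powers of the generator.
    The exp table is doubled in length so that log[a] + log[b] can be
    used as an index without a modulo operation."""
    exp = [0] * (2 * ORDER)
    log = [0] * FIELD_SIZE
    x = 1
    for i in range(ORDER):
        exp[i] = x
        log[x] = i
        x = gf_mul_loop(x, 2)
    for i in range(ORDER, 2 * ORDER):
        exp[i] = exp[i - ORDER]
    return log, exp


GF_LOG, GF_EXP = build_tables()


def gf_mul_table(a, b):
    """Table-based multiply: a * b = exp[log a + log b]; zero is special."""
    if a == 0 or b == 0:
        return 0
    return GF_EXP[GF_LOG[a] + GF_LOG[b]]
```

The loop-based version needs no memory lookups but runs up to eight iterations per product, while the table-based version replaces the loop with three table reads. Which one is faster on a GPU depends on where the tables are stored and how threads access them, which is the question the sections listed above address.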

    [1] D. Borthakur, R. Schmidt, R. Vadali, S. Chen, and P. Kling. HDFS RAID. In Hadoop User Group Meeting, 2010.
    [2] X. Chu and K. Zhao. Practical random linear network coding on GPUs. In GPU Solutions to Multi-scale Problems in Science and Engineering, pages 115-130. Springer, 2013.
    [3] NVIDIA. CUDA C programming guide. NVIDIA Corporation, July 2012.
    [4] M. L. Curry, A. Skjellum, H. L. Ward, and R. Brightwell. Accelerating Reed-Solomon coding in RAID systems with GPUs. In Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on, pages 1-6. IEEE, 2008.
    [5] A. Fikes. Storage architecture and challenges. Talk at the Google Faculty Summit, 2010.
    [6] D. Ford, F. Labelle, F. I. Popovici, M. Stokely, V.-A. Truong, L. Barroso, C. Grimes, and S. Quinlan. Availability in globally distributed storage systems. In OSDI, pages 61-74, 2010.
    [7] S. Ghemawat, H. Gobioff, and S.-T. Leung. The Google file system. In ACM SIGOPS Operating Systems Review, volume 37, pages 29-43. ACM, 2003.
    [8] B. J. Gimmestad. The Russian peasant multiplication algorithm: A generalization. The Mathematical Gazette, 75(472):169-171, 1991.
    [9] K. M. Greenan, E. L. Miller, and T. J. Schwarz. Optimizing Galois field arithmetic for diverse processor architectures and applications. In Modeling, Analysis and Simulation of Computers and Telecommunication Systems, 2008. MASCOTS 2008. IEEE International Symposium on, pages 1-10. IEEE, 2008.
    [10] C. Huang and L. Xu. Fast software implementation of finite field operations. Technical report, Citeseer, 2003.
    [11] S. Kalcher and V. Lindenstruth. Accelerating Galois field arithmetic for Reed-Solomon erasure codes in storage applications. In Cluster Computing (CLUSTER), 2011 IEEE International Conference on, pages 290-298. IEEE, 2011.
    [12] NVIDIA. CUDA C best practices guide. NVIDIA, Santa Clara, CA, 2012.
    [13] J. S. Plank, S. Simmerman, and C. D. Schuman. Jerasure: A library in C/C++ facilitating erasure coding for storage applications - version 1.2. University of Tennessee, Tech. Rep. CS-08-627, 23, 2008.
    [14] J. S. Plank and L. Xu. Optimizing Cauchy Reed-Solomon codes for fault-tolerant network storage applications. In Network Computing and Applications, 2006. NCA 2006. Fifth IEEE International Symposium on, pages 173-180. IEEE, 2006.
    [15] I. Reed and G. Solomon. Polynomial codes over certain finite fields. Journal of the Society for Industrial & Applied Mathematics, 8(2):300-304, 1960.
    [16] T. S. Schwarz and E. L. Miller. Store, forget, and check: Using algebraic signatures to check remotely administered storage. In Distributed Computing Systems, 2006. ICDCS 2006. 26th IEEE International Conference on, pages 12-12. IEEE, 2006.
    [17] H. Shojania and B. Li. Pushing the envelope: Extreme network coding on the GPU. In Distributed Computing Systems, 2009. ICDCS'09. 29th IEEE International Conference on, pages 490-499. IEEE, 2009.
    [18] H. Shojania, B. Li, and X. Wang. Nuclei: GPU-accelerated many-core network coding. In INFOCOM 2009, IEEE, pages 459-467. IEEE, 2009.
    [19] K. Shvachko, H. Kuang, S. Radia, and R. Chansler. The Hadoop distributed file system. In Mass Storage Systems and Technologies (MSST), 2010 IEEE 26th Symposium on, pages 1-10. IEEE, 2010.
    [20] W. A. Wulf and S. A. McKee. Hitting the memory wall: implications of the obvious. ACM SIGARCH Computer Architecture News, 23(1):20-24, 1995.

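    Full-text release date: full text not authorized for public release (on-campus network)
    Full-text release date: full text not authorized for public release (off-campus network)
    Full-text release date: full text not authorized for public release (National Central Library: Taiwan theses and dissertations system)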