Graduate student: | 洪鼎詠 Ding-Yong Hong |
---|---|
Thesis title: | InfiniBand上一個有效率處理不連續輸入輸出的MPI-IO An Efficient MPI-IO for Noncontiguous I/O over InfiniBand |
Advisor: | 鍾葉青 Yeh-Ching Chung |
Oral defense committee: | |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Year of publication: | 2005 |
Academic year of graduation: | 93 |
Language: | English |
Number of pages: | 26 |
Chinese keywords: | 收集型輸入輸出 、InfiniBand 、遠端直接記憶體存取 、MPI-IO 、資料型態 |
English keywords: | Collective I/O, InfiniBand, RDMA, MPI-IO, datatype |
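In many scientific applications, noncontiguous I/O access is the dominant access pattern. On cluster systems, the collective I/O used by MPI-IO offers a way to access noncontiguous data segments. Collective I/O is also known as two-phase I/O. In one phase, the I/O clients exchange information about the noncontiguous data they hold and then redistribute the noncontiguous data segments among themselves; in the other phase, the I/O clients read contiguous data from, or write it to, the I/O servers. Under this two-phase approach, much of the same data must cross the network two or more times. This repeated transfer is costly in itself and also adds extra traffic to the network; in I/O-intensive programs it becomes a burden on the network.
In this thesis, we extend the original collective I/O design and propose a new I/O method that uses the RDMA Gather/Scatter support in InfiniBand hardware, so that data is transferred only once instead of twice as in collective I/O. We also bring the MPI-IO datatype and view concepts into the parallel file system to complete the design. Experiments show that our design is more efficient and performs better than the existing design.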
Noncontiguous data access is a very common access pattern in many scientific applications. Using POSIX I/O to access many noncontiguous data segments generates a large number of I/O requests, which causes the I/O system to perform poorly. Two-phase I/O, also called collective I/O, is used by MPI-IO to optimize noncontiguous I/O operations. In two-phase I/O, the independent I/O operations that make up the collective operation are analyzed to determine which data segments must be redistributed and transferred. This redistribution requires some information to be calculated in advance, and many data segments are transmitted over the network more than once. With I/O-intensive applications, the aggregate size of the redistributed data segments becomes significantly large. Computing the redistribution information then consumes much more CPU time, and sending the data segments consumes much more network resources. This additional overhead degrades system performance.
In this thesis, we extend the collective I/O method and propose a new I/O scheme that avoids retransmission of data segments by applying the RDMA Gather/Scatter operations supported by InfiniBand hardware. We also extend the “view” and “datatype” concepts of MPI into the file system to complete our design. The experiments show that our method improves performance and is more efficient than the collective I/O approach.
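The claim that two-phase I/O sends data over the network more than once can be illustrated with a toy byte count. The sketch below is not the thesis's implementation: it assumes a collective write in which every client also acts as an aggregator for one contiguous slice of the file, a simple column-block data layout, and no accounting for the metadata exchanged during redistribution. All function names are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # (file offset, length in bytes)


def strided_segments(rank: int, nprocs: int, rows: int, block: int) -> List[Segment]:
    """Noncontiguous file segments owned by `rank` when every row of a
    row-major file holds one `block`-byte piece per process."""
    row_bytes = nprocs * block
    return [(r * row_bytes + rank * block, block) for r in range(rows)]


def two_phase_network_bytes(nprocs: int, rows: int, block: int) -> int:
    """Bytes crossing the network in the toy two-phase write.

    Phase 1: clients ship each segment to the aggregator that owns the
    matching contiguous file slice (no traffic if that is the client itself).
    Phase 2: aggregators write their contiguous slices to the I/O server.
    """
    file_size = rows * nprocs * block
    domain = file_size // nprocs  # contiguous bytes handled per aggregator
    exchanged = 0
    for rank in range(nprocs):
        for offset, length in strided_segments(rank, nprocs, rows, block):
            aggregator = offset // domain
            if aggregator != rank:
                exchanged += length
    return exchanged + file_size


def direct_network_bytes(nprocs: int, rows: int, block: int) -> int:
    """Bytes crossing the network when each client sends its segments
    straight to the I/O server together with an offset/length list."""
    return sum(
        length
        for rank in range(nprocs)
        for _, length in strided_segments(rank, nprocs, rows, block)
    )


if __name__ == "__main__":
    print(two_phase_network_bytes(4, 4, 8))  # 224
    print(direct_network_bytes(4, 4, 8))     # 128
```

With 4 clients, 4 rows and 8-byte blocks, the file is 128 bytes. In this model, 96 of those bytes move between clients before the 128-byte write to the server, so the two-phase path moves 224 bytes, against 128 bytes for the single-transfer path.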