Graduate student: Chueh, Hung-Shih (闕宏時)
Thesis title: Design and Implementation of Load-balanced Switching Networks (負載平衡交換網路之設計與實作)
Advisors: Chang, Cheng-Shang (張正尚)
Lee, Duan-Shin (李端興)
Oral defense committee: Chang, Pao-Chi (張寶基)
Yuang, Maria C. (楊啟瑞)
Liao, Wanjiun (廖婉君)
Chang, Cheng-Shang (張正尚)
Lee, Duan-Shin (李端興)
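Degree: Doctor
Department: College of Electrical Engineering and Computer Science - Institute of Communications Engineering
Year of publication: 2011
Academic year of graduation: 99 (ROC calendar)
Language: Chinese
Number of pages: 203
Keywords: cloud computing, load-balanced switch, data center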
    As cloud computing technology matures and its applications proliferate, the data center has become not only the core of cloud computing but also a direct determinant of its performance. Following the recommendations of major network equipment manufacturers, data center networks are built by interconnecting the currently mainstream input-buffered switches in a fat-tree topology to connect the back-end servers. In a fat-tree topology, however, the required link bandwidth grows exponentially with the aggregation level, and the network processors in the input-buffered switches must run correspondingly faster. Both effects limit the scale or the performance of the data center network.
    To address these problems, we use topological equivalence to treat a fat-tree topology as a Benes network. Because Benes networks can realize all input-output permutations, they are widely used to construct switch fabrics, so a fat-tree topology can in turn be viewed as a switch fabric. The distinctive property of the scalable load-balanced switch is that its internal switching nodes achieve 100% throughput by periodically running predefined connection patterns, regardless of the external traffic distribution. When the load-balanced schedule is applied to the switching nodes of a fat-tree topology, the internal nodes only switch cells according to these predefined patterns, so no network processors are needed. Because the load-balanced schedule requires only one-cycle permutations, we can use the bit-reversal permutation to reduce the required bandwidth and, further, to reduce the complexity of the switching network from that of a Benes network to that of a banyan network.
    To realize the basic switching element from which a network-processor-free (NP-free) fat-tree topology can be built, we propose design solutions to the practical challenges, complete the Verilog modules for the line cards and the switch fabric, and develop an 18-layer printed circuit board as a general-purpose programmable hardware platform. By integrating the programmable boards carrying the line-card and switch-fabric Verilog modules into a standard AdvancedTCA chassis and connecting two external Intel network processor development platforms, we complete a prototype of the load-balanced switch.
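    The abstract refers to two building blocks without spelling them out: a predefined periodic connection pattern and the bit-reversal permutation. The Python sketch below illustrates both under stated assumptions. The circular-shift pattern (input i connected to output (i + t) mod N in time slot t) is one pattern commonly used in the load-balanced switch literature; the abstract does not say which patterns the thesis uses, and the function names here are hypothetical and chosen only for illustration.

```python
def connection_pattern(n_ports: int, t: int) -> list[int]:
    """Output port connected to each input port in time slot t.

    Assumption: a circular-shift pattern, input i -> output (i + t) mod N.
    The pattern depends only on the time slot, never on the traffic.
    """
    return [(i + t) % n_ports for i in range(n_ports)]


def bit_reverse(index: int, n_bits: int) -> int:
    """Reverse the n_bits-bit binary representation of index."""
    result = 0
    for _ in range(n_bits):
        result = (result << 1) | (index & 1)  # move the lowest bit to the other end
        index >>= 1
    return result


def bit_reversal_permutation(n_ports: int) -> list[int]:
    """Bit-reversal permutation on n_ports = 2**n_bits ports."""
    n_bits = n_ports.bit_length() - 1
    if n_ports < 1 or (1 << n_bits) != n_ports:
        raise ValueError("n_ports must be a power of two")
    return [bit_reverse(i, n_bits) for i in range(n_ports)]


if __name__ == "__main__":
    N = 8
    # One period of the schedule: every input meets every output exactly once.
    for t in range(N):
        print(f"slot {t}: {connection_pattern(N, t)}")
    print("bit reversal:", bit_reversal_permutation(N))
```

    Over one period of N time slots, each input is connected to each output exactly once, which is why such a schedule needs no per-packet decision in the internal nodes. How the thesis combines this with the bit-reversal permutation to lower the bandwidth requirement is not detailed in the abstract, so the sketch stops at the two primitives.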

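    Chinese Abstract
    English Abstract
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
        1.1 Data Center Networks
            1.1.1 Fat-tree
            1.1.2 DCell
            1.1.3 BCube
            1.1.4 VL2
        1.2 Motivation
            1.2.1 Viewing the Data Center Network as a Load-Balanced Switch
            1.2.2 Reducing the Required Bandwidth and the Complexity of the Switching Network
            1.2.3 Realizing a Basic Switching Prototype
        1.3 Outline
    Chapter 2 Load-Balanced Switching Networks
        2.1 Review of Key Literature
            2.1.1 Properties of the Load-Balanced Switch
            2.1.2 Benes Networks
            2.1.3 Topological Equivalence of Fat-tree and Benes Networks
        2.2 Viewing the Fat-tree Topology as a Switch Fabric to Realize a Network-Processor-Free Switching Network
    Chapter 3 Reducing Bandwidth Requirements Using Bit-Reversal Permutation Matrices
        3.1 Latin Matrices
        3.2 Bit-Reversal Permutation Matrices
        3.3 Optimizing Bandwidth Requirements
        3.4 Simplifying the Complexity of the Switching Network
    Chapter 4 System Challenges and Design Solutions
        4.1 The Complexity of Centralized Virtual Output Queues Limits Scalability
        4.2 Transmission Distance Variation and Long Propagation Delay Limit Switching Performance
        4.3 The Synchronization Assumption Limits the Variation in Transmission Distance from Different Line Cards to the Switch Fabric
        4.4 Conventional Virtual Output Queue Management Requires Speedup
        4.5 Out-of-Order Transmission Makes Packet Reassembly Difficult
        4.6 A Failed Line Card Causes Permanently Unrecoverable Packet Reassembly
    Chapter 5 System Implementation
        5.1 Network Processor Development Platform
        5.2 General-Purpose Programmable Hardware Platform for Line Cards and the Switch Fabric
            5.2.1 DC Power Module
            5.2.2 FPGA and Peripheral Support Modules
        5.3 Verilog Module Design for Line-Card and Switch-Fabric Functions
        5.4 Line-Card Verilog Functional Modules
            5.4.1 Network Processor CSIX Interface Controller
            5.4.2 Virtual Output Queues
                5.4.2.1 Queue Manager
                5.4.2.2 Buffer Manager
            5.4.3 Switch-Fabric SerDes Interface
        5.5 Switch-Fabric Verilog Functional Modules
            5.5.1 Line-Card Fault Detection Module
            5.5.2 Fault-Tolerant Switch Fabric
        5.6 Switching System Prototype
    Chapter 6 Functional Verification
        6.1 DC Power Module
        6.2 Load-Balanced Switching
        6.3 Memory System
        6.4 Network Processor Packet Segmentation
        6.5 Packet Tracing
        6.6 Fault-Tolerant System
        6.7 Streaming Application Verification
    Chapter 7 Conclusion
    References
    Appendix 1 Circuit Board Assembly BOM
    Appendix 2 Description of Circuit Schematics
    Appendix 3 Circuit Board Stack-up Design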

    Full text: not authorized for public release (on-campus and off-campus networks)