
Graduate student: Lin, Yu Tsuen (林育存)
Thesis title: Maximize cloud storage data transfer throughput through data partition, compression, re-transmission, parallelism and adaptation
Advisor: Chou, Chi Yuan (周志遠)
Committee members: 李哲榮
許慶賢
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications
Year of publication: 2016
Academic year of graduation: 104 (ROC calendar)
Language: English
Number of pages: 35
Keywords (Chinese, translated): auto-tuning, throughput, compression, cloud storage, bin-packing problem
Keywords (English): throughput, cloud storage, AWS, bin-packing
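  • More and more enterprises and individual users store their data in cloud storage, so completing uploads faster is an important issue. Upload speed varies with the hardware involved, such as network bandwidth and memory. In this thesis, we use experiments to show the problems that lead to low upload speed. Based on these problems, we devise corresponding strategies and implement a tool named Fast Cloud Upload (快速雲端上傳). During an upload, the tool automatically adjusts the relevant parameter settings in order to upload at the highest possible speed. We also discuss the consequences of a disconnection during an upload and identify the file size best suited to uploading. Compared with other existing tools, the tool achieves a 3x to 85x performance improvement in experiments on different data types and a 15x to 717x improvement in experiments on upload destinations with different latencies.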


    The number of enterprises and individual users storing their data in cloud storage has increased dramatically, so speeding up uploads is an important issue. Upload speed varies with machine specification, network bandwidth, data type, and other factors. In this thesis, we identify the causes of low upload throughput through experiments. Based on these findings, we devise policies that address each problem and implement a tool named Fast Cloud Transfer Tool. The tool automatically reconfigures its upload settings during the upload to achieve the expected throughput. We also discuss how failures affect upload throughput and determine the optimal part size. Compared with other existing tools, our evaluations show a 3x to 85x speedup across different data types and a 15x to 717x speedup across destinations with different latencies.
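    The record lists bin-packing among its keywords but does not describe the algorithm the thesis uses. As a rough illustration only, the sketch below groups files into upload parts with the standard first-fit-decreasing heuristic. The function name `pack_files`, the `part_size` parameter, and the choice of first-fit decreasing are all assumptions made for this example and are not taken from the thesis.

```python
from typing import List


def pack_files(sizes: List[int], part_size: int) -> List[List[int]]:
    """Group file sizes into parts of at most part_size bytes.

    Hypothetical helper: uses first-fit decreasing, a common bin-packing
    heuristic, which may differ from the method used in the thesis.
    """
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    parts: List[List[int]] = []  # file sizes assigned to each part
    free: List[int] = []         # remaining capacity of each part
    # Placing the largest files first tends to leave less wasted space.
    for size in sorted(sizes, reverse=True):
        if size > part_size:
            # An oversized file gets a part of its own in this sketch;
            # a real uploader would more likely split it.
            parts.append([size])
            free.append(0)
            continue
        for i, room in enumerate(free):
            if size <= room:
                # First part with enough room takes the file.
                parts[i].append(size)
                free[i] -= size
                break
        else:
            # No existing part fits, so open a new one.
            parts.append([size])
            free.append(part_size - size)
    return parts
```

    For instance, calling `pack_files([4, 8, 1, 4, 2, 1], 10)` packs six files into two full parts rather than six separate uploads, which is the kind of grouping that reduces per-request overhead for many small files.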

    1 Introduction
    2 Motivation
        2.1 Environment on Test-bed
        2.2 Client side underutilization
        2.3 Server side congestion
        2.4 Limited network bandwidth
        2.5 Failure
    3 Approach
        3.1 Concurrency model
        3.2 Bin-packing
        3.3 Timeout decision
        3.4 Compression decision model
        3.5 Part size analysis
        3.6 Dynamic adaptive
    4 Software Overview
        4.1 Network profiling
        4.2 Sampling and bin-packing
        4.3 Upload control and Throughput monitor
    5 Evaluation
        5.1 Experimental Setup
        5.2 Experiment 1 - Data type
        5.3 Experiment 2 - Latency
        5.4 Experiment 3 - Usages of cloud storage
    6 Related work
    7 Conclusion

