Graduate student: | 陳增益 Chen, Tseng-Yi |
---|---|
Thesis title: | 基於新式經濟評估模型的節能、可靠儲存機制暨相關工具設計用於資料密集典藏系統之研究 Based on a Novel Economic Evaluation Model to Design an Energy-efficient and Reliable Storage Mechanism with Associated Tools for Data-intensive Archive System |
Advisor: | 石維寬 Shih, Wei-Kuan |
Oral defense committee: | 陳俊良 Chen, Jiann-Liang; 郭大維 Kuo, Tei-Wei; 呂政修 Leu, Jenq-Shiou; 衛信文 Wei, Hsin-Wen; 徐讚昇 Hsu, Tsan-sheng; 徐正炘 Hsu, Cheng-Hsin; 黃能富 Huang, Nen-Fu; 張原豪 Chang, Yuan-Hao |
Degree: | Doctor |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Year of publication: | 2015 |
Academic year of graduation: | 103 |
Language: | English |
Number of pages: | 57 |
Keywords (translated from Chinese): | green data center, energy-efficient storage system, evaluation model, reliability, storage simulation tool |
Keywords (English): | Green data centers, energy-efficient system, reliability, economic evaluation, simulation tool |
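Big data is an increasingly important topic. With the growth of mobile devices and the spread of network technology, much data is generated by end devices and stored over the network in remote cloud storage. Besides preserving the data that users upload, a typical cloud storage provider must also use a data fault-tolerance mechanism, such as a Redundant Array of Independent Disks (RAID), to generate redundant data that improves data safety and lowers the probability that data cannot be recovered after corruption. As a result, the storage space of a cloud storage service is usually built from a large number of low-cost, power-hungry hard disk drives. Previous studies found that electricity accounts for roughly 50% of a cloud storage provider's operating cost, and further analysis found that the storage system accounts for 27% of the total power consumed. Green data centers have therefore become a very important consideration when designing storage systems. Many energy-saving storage methods have been proposed to reduce the power consumption of storage systems, and they fall into two broad categories: (1) dynamically powering down idle disks and waking them when an access arrives, and (2) building the storage system from high-speed and low-speed disks and placing data on disks of different rotational speeds according to its characteristics. The first approach switches disks on and off frequently, which shortens disk lifetime and in turn raises hardware replacement costs, yet previously proposed solutions did not address this problem. The second approach offers better hardware reliability because it does not save power by switching disks on and off, but it saves less energy.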
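Combining these two design perspectives, this study proposes a new evaluation method, E3SaRC, which considers both the energy savings that an energy-saving mechanism brings to a storage system and its impact on hardware cost. Under this method, every energy-efficient storage design is reconsidered from an economic standpoint, because the design goal is no longer power savings alone; the hardware cost incurred by saving power must be weighed as well. In addition to the new evaluation model, and in keeping with the original motivation of the research, this dissertation designs a storage mechanism, CacheRAID, that considers both energy savings and the cost impact on the system of implementing energy saving. The design uses solid-state drives as the system's write buffer and arranges data in the buffer by correlation according to users' access patterns, producing large amounts of sequential access. This maximizes access performance and gives the system's idle disks more time to rest. Considering the cost of disk state transitions, we also integrate a disk state-transition control mechanism into the proposed architecture to reduce the impact on hardware cost. Finally, to improve the performance of the proposed system, this dissertation also presents a simulation tool that simulates the power behavior of a storage system and modularizes the energy-saving function, so that users can implement their own energy-saving methods in a module and quickly evaluate their effect on a storage system. The simulation tool also finds the best settings for the proposed storage mechanism, improving the performance and reliability of the designed system.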
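The idea of absorbing writes in a buffer and later issuing them as long sequential runs can be sketched as follows. This is only an illustration under stated assumptions, not CacheRAID itself: the abstract describes correlation-based placement driven by user access patterns, whereas this sketch substitutes a much simpler technique (sorting buffered blocks by address and merging contiguous addresses). The class name `WriteBuffer`, its parameters, and the block-address model are all hypothetical.

```python
class WriteBuffer:
    """Hypothetical write buffer that turns scattered writes into sequential runs.

    Simplification: blocks are grouped by address adjacency, not by the
    access-pattern correlation that the abstract describes.
    """

    def __init__(self, capacity_blocks):
        self.capacity_blocks = capacity_blocks
        self._pending = {}  # block address -> most recent data for that block

    def write(self, block, data):
        """Buffer one block; flush automatically once the buffer is full."""
        self._pending[block] = data
        if len(self._pending) >= self.capacity_blocks:
            return self.flush()
        return []

    def flush(self):
        """Return buffered data as (start_block, [data, ...]) sequential runs."""
        runs = []
        for block in sorted(self._pending):
            # Extend the last run when this block directly follows it.
            if runs and block == runs[-1][0] + len(runs[-1][1]):
                runs[-1][1].append(self._pending[block])
            else:
                runs.append((block, [self._pending[block]]))
        self._pending.clear()
        return runs


buf = WriteBuffer(capacity_blocks=4)
for block, data in [(7, "a"), (3, "b"), (8, "c")]:
    buf.write(block, data)
runs = buf.write(4, "d")  # fourth write fills the buffer and triggers a flush
print(runs)
```

Four scattered writes leave the buffer as two sequential runs, so the backing disks see fewer, longer accesses and can stay idle between flushes.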
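At the end of the dissertation, we evaluate the proposed system with experiments on real machines. In the experiments we ran two sets of real storage system workloads, from the digital archive system of Academia Sinica and from the file system of Florida International University. The results show that the proposed energy-efficient storage mechanism is the only method that saves storage system cost under the new economic evaluation model; the other compared methods yield worse results because they do not consider the impact on hardware cost. The results also show that the proposed method saves more than 65% of the storage system's power consumption. Furthermore, the power consumption estimated by the simulation tool for the proposed method is close to that measured on the real machines, with an error rate of only about 2.5%. According to the experimental results, this dissertation presents a complete energy-saving system that balances energy saving with system reliability and sufficiently reduces the impact of energy saving on hardware cost.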
Recently, green data centers have garnered much attention due to the dramatic growth of data in every conceivable industry and application. With high network bandwidth, mobile applications and user clients routinely back up program and user data to remote data centers. In addition to storing user data, a data center usually employs a data fault-tolerance mechanism that generates redundant data to protect user data from loss or corruption. Because so much data must be preserved, the storage system accounts for about 27%-35% of the power consumption of a typical data center. To reduce the energy consumption of storage systems, previous studies conserved power by switching idle disks to standby/sleep modes. According to research conducted by Google and the IDEMA standard, frequently switching a disk to standby mode increases its Annual Failure Rate and shortens its lifespan. In most cases, however, the authors of those studies did not analyze the reliability of their solutions. To address this issue, we propose an evaluation function called E3SaRC (Economic Evaluation of Energy saving with Reliability Constraint), which comprehensively evaluates the effects of an energy-efficient solution by considering the cost of hardware failure when energy-saving schemes are applied.
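The trade-off that such an evaluation weighs can be illustrated with a small calculation. The abstract does not give the E3SaRC formula, so the function below is an assumed, simplified stand-in: it subtracts the extra disk replacement cost caused by a higher Annual Failure Rate from the value of the electricity saved. The function name, every parameter, and the numbers in the usage lines are hypothetical.

```python
def net_annual_saving(energy_saved_kwh, price_per_kwh,
                      baseline_afr, afr_with_scheme,
                      disk_count, disk_replacement_cost):
    """Illustrative net yearly saving of an energy-saving scheme.

    Assumption (not the thesis's E3SaRC definition): the only hardware
    penalty is the cost of the additional disks expected to fail per year.
    """
    energy_benefit = energy_saved_kwh * price_per_kwh
    extra_failures = (afr_with_scheme - baseline_afr) * disk_count
    hardware_penalty = extra_failures * disk_replacement_cost
    return energy_benefit - hardware_penalty


# Hypothetical 100-disk array, 10,000 kWh saved per year at 0.10 per kWh.
gentle = net_annual_saving(10_000, 0.10, 0.02, 0.05, 100, 80)
aggressive = net_annual_saving(10_000, 0.10, 0.02, 0.20, 100, 80)
print(round(gentle, 2), round(aggressive, 2))
```

With these made-up inputs, the scheme that raises the failure rate only slightly keeps a positive net saving, while the one that spins disks down aggressively loses more on replacements than it gains on electricity.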
With both system reliability and energy efficiency in mind, this study proposes an energy-efficient and reliable storage system composed of an energy-efficient storage scheme with a data fault-tolerance algorithm, an adaptive simulation tool, and a monitoring framework. First, because power consumption is the most important issue in this dissertation, we developed a data placement mechanism called CacheRAID, based on a Redundant Array of Independent Disks (RAID-5) architecture, to mitigate the random access problems that implicitly exist in RAID techniques and thereby reduce the energy consumption of RAID disks. To address system reliability, CacheRAID applies a control mechanism to the spin-down algorithm. To further improve the energy efficiency of the proposed system, an adaptive simulation tool finds the best system parameters for CacheRAID by quickly simulating the current workload on the storage system.
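One plausible way to control a spin-down algorithm is to cap how often a disk may be spun down within a time window. The abstract does not describe the actual control mechanism in CacheRAID, so the sketch below is an assumption for illustration only; `SpinDownController` and its parameters (`idle_threshold_s`, `max_cycles`, `window_s`) are hypothetical names.

```python
from collections import deque


class SpinDownController:
    """Hypothetical guard that limits spin-down frequency for one disk."""

    def __init__(self, idle_threshold_s, max_cycles, window_s):
        self.idle_threshold_s = idle_threshold_s  # minimum idle time before spin-down
        self.max_cycles = max_cycles              # spin-downs allowed per window
        self.window_s = window_s                  # sliding window length in seconds
        self._history = deque()                   # timestamps of granted spin-downs

    def should_spin_down(self, now_s, idle_s):
        """Grant a spin-down only if the disk is idle enough and within budget."""
        # Forget spin-downs that have aged out of the sliding window.
        while self._history and now_s - self._history[0] >= self.window_s:
            self._history.popleft()
        if idle_s < self.idle_threshold_s:
            return False
        if len(self._history) >= self.max_cycles:
            return False
        self._history.append(now_s)
        return True


ctrl = SpinDownController(idle_threshold_s=60, max_cycles=2, window_s=3600)
decisions = [ctrl.should_spin_down(t, idle) for t, idle in
             [(0, 30), (100, 120), (500, 120), (900, 120), (3700, 120)]]
print(decisions)
```

The budget trades some energy savings for fewer start-stop cycles, which is the kind of balance between power and hardware wear that the abstract argues for.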
Finally, the contributions of this dissertation are presented in two parts. First, our experimental results show that the proposed storage system reduces the power consumption of a conventional software RAID 5 system by 65-80%. Moreover, according to the E3SaRC measurement, the overall cost saved by CacheRAID is the largest among the systems that we compared. Second, the analytical results demonstrate that the energy estimates of the proposed simulation tool deviate from real-world measurements by only about 2.5%. Therefore, the proposed tool can accurately simulate the power consumption of a storage system under different system settings. According to the experimental results, the proposed system can significantly reduce storage system power consumption and increase system reliability.