Efficient Approaches for Adaptive Load Shedding in a Data Stream Management System

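Graduate student:	Li-Yuan Chang (張力元)
Thesis title:	Efficient Approaches for Adaptive Load Shedding in a Data Stream Management System (資料串流管理系統中有效率的具適應性卸載方法)
Advisor:	Arbee L.P. Chen (陳良弼)
Committee members:
Degree:	Master
Department:	College of Electrical Engineering and Computer Science, Department of Computer Science
Year of publication:	2005
Academic year of graduation:	93 (ROC calendar)
Language:	English
Number of pages:	43
Keywords (Chinese):	卸載、卸載器、Aurora、資料串流管理系統
Keywords (English):	load shedding, load shedder, Aurora, data stream management system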

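Data stream management systems (DSMSs) have been an active research area in recent years. Because the resources available to a DSMS are limited while the input rates of data streams are unpredictable, it is important that the system maintain stable performance when the workload exceeds its capacity. An earlier project, Aurora, posed the load shedding problem for an environment with only a limited number of CPU cycles per unit of time, and proposed inserting operators that drop some of the data, so as to reduce the workload, when the DSMS is about to become overloaded. In practice there may be many schemes for inserting these operators. Aurora computes all possible schemes in advance and stores them in a lookup table for handling overload. Because stream input rates vary, this table may need to be rebuilt. In this thesis, we adopt the same context as Aurora and propose adaptive approaches to rebuilding the table for fast load shedding. We propose a novel way to decide, by observing the input rates of the data streams, whether the table needs to be modified. Even when it does, many of the schemes in it can still be reused, so both of our approaches modify only some of the schemes in the table. The first approach produces a table containing all possible schemes; the second produces only the schemes necessary to shed the current workload. Experimental results show that both approaches spend less time maintaining the table than Aurora's approach does.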

The data stream management system (DSMS) has been an active research area in recent years. Since the resources available to a DSMS are limited and the input rates of data streams are unpredictable, ensuring steady performance when the workload exceeds the system capacity is of critical importance. Previous work on Aurora motivates the problem of load shedding under a limited number of CPU cycles per time unit and proposes inserting data-dropping operators into the DSMS once an overload is expected to occur. There can be various schemes for inserting such operators. In Aurora, all possible schemes are precomputed and stored in a lookup table for handling overload. Because the input rates of data streams vary, this table may need to be reconstructed. In this thesis, we follow the context of Aurora and propose approaches that adaptively reconstruct the table for efficient load shedding. We devise a novel method that observes the input rates of the data streams to check whether the load-shedding table should be modified. Even when the table must be modified, some of its schemes can still be reused. We therefore design two approaches that modify only some of the schemes in the table. One approach produces a table with all possible schemes. The other produces a table with only the necessary schemes, which are sufficient to shed the current workload. Experimental results show that both of our approaches outperform Aurora's approach in the time required to maintain the load-shedding table.
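The lookup-table idea in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's or Aurora's actual algorithm: the `Scheme` layout, the function names, the operator names, and the 20% rate-change tolerance are all assumptions made for illustration. The sketch keeps precomputed schemes sorted by the load they shed, picks the smallest scheme that covers the current excess load, and uses a simple relative-change test on observed input rates to decide whether the table might be stale.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    """One precomputed way of inserting drop operators (hypothetical layout)."""
    cycles_saved: float  # CPU cycles per time unit that this scheme sheds
    drops: tuple         # (operator_name, drop_fraction) pairs, illustrative only


def build_table(schemes):
    """Return the schemes sorted by the load they shed, smallest first."""
    return sorted(schemes, key=lambda s: s.cycles_saved)


def pick_scheme(table, load, capacity):
    """Pick the smallest scheme that sheds at least the excess load.

    Returns None when the system is not overloaded or the table is empty,
    and the largest scheme when no single scheme covers the excess.
    """
    excess = load - capacity
    if excess <= 0 or not table:
        return None
    savings = [s.cycles_saved for s in table]
    i = bisect_left(savings, excess)  # first scheme with savings >= excess
    return table[min(i, len(table) - 1)]


def needs_rebuild(old_rates, new_rates, tolerance=0.2):
    """Assumed staleness test: some stream's rate moved by more than `tolerance`."""
    return any(
        abs(new_rates[name] - old) > tolerance * old
        for name, old in old_rates.items()
    )


# Example with made-up numbers: capacity 1000 cycles, current load 1150 cycles.
table = build_table([
    Scheme(300.0, (("filter_a", 0.5),)),
    Scheme(100.0, (("filter_a", 0.1),)),
    Scheme(200.0, (("filter_a", 0.3),)),
])
chosen = pick_scheme(table, load=1150.0, capacity=1000.0)
print(chosen.cycles_saved)  # 200.0
```

Keeping the table sorted makes each lookup a binary search, which is one plausible reason to precompute schemes at all. The costly part is rebuilding the table when rates change, which is the maintenance cost the thesis sets out to reduce.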

Abstract
Acknowledgements
Table of Contents
List of Figures
Chapter 1: Introduction
Chapter 2: Preliminaries
2.1 Terminology
2.2 Our Framework for Adaptive Load Shedding
Chapter 3: Load Shedding Road Map
3.1 Building the LSRM
3.2 Using the LSRM
Chapter 4: Adaptive Load Shedding
4.1 Complete LSRM Construction
4.2 Adaptive LSRM Construction
Chapter 5: Experiments
5.1 Environment
5.2 Results
    5.2.1 LSRM Maintaining Cost Analysis
    5.2.2 Scalability Analysis
Chapter 6: Conclusion
References

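Full-text availability (on-campus network): not authorized for public release
Full-text availability (off-campus network): not authorized for public release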
