研究生: |
范耀中 Fan, Yao-Chung |
---|---|
論文名稱: |
Efficient and Robust Schemes for Accuracy-Guaranteed Sensor Data Aggregation 具高能源效率與高可靠度之感測資料聚合計算方法設計 |
指導教授: |
陳良弼
Chen, Arbee L. P. |
口試委員: | |
學位類別: |
博士 Doctor |
系所名稱: |
電機資訊學院 - 資訊工程學系 Computer Science |
論文出版年: | 2010 |
畢業學年度: | 99 |
語文別: | 英文 |
論文頁數: | 73 |
中文關鍵詞: | 無線感測網路 、速寫結構 、聚合計算 |
外文關鍵詞: | sensor network, sketch, synopsis |
相關次數: | 點閱:3 下載:0 |
分享至: |
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報 |
Sensor networks have received considerable attention in recent years, and are often employed in the applications where data are difficult or expensive to collect. In these applications, in addition to individual sensor readings, statistical aggregates such as Min and Count over the readings of a group of sensor nodes are often needed. Possible ap-plications using aggregates are the average reading reporting, the number of active sen-sor counting, and the maximal noxious-gas density region finding. To conserve re-sources for sensor nodes, in-network aggregation strategies are commonly adopted, which process the data collected from sensor nodes during routing trips, and therefore significantly reduce the number of messages the network has to transmit. However, the in-network aggregation strategies often are not robust against communication failures, which are common in sensor networks. One communication failure may cause the in-formation of a large group of sensor nodes to be missed out in the final aggregate, caus-ing an inaccurate aggregate result.
To enhance the robustness of the aggregate computation, multi-path-based aggrega-tion is often used. However, the multi-path-based aggregation suffers from the problem of overcounting sensor readings. The approaches using the multi-path-based aggrega-tion therefore need to incorporate with techniques that avoid overcounting sensor read-ings through representing sensor readings by duplicate-insensitive synopses. With the du-plicate-insensitive property, each sensor reading is counted only once, and therefore the overcounting problem is avoided. Many previously known techniques can be used for producing duplicate-insensitive synopses, but each has its own shortcomings. Under-standing, analyzing, and developing duplicate-insensitive synopses are the focuses of my dissertation. This dissertation summarizes my efforts towards the design of dupli-cate-insensitive synopses. We focus on having an (ε, δ) accuracy guarantee for comput-ing an aggregate, which ensures that the error in computing the aggregate is within a factor of ε with probability (1 – δ). We propose a family of techniques for producing du-plicate-insensitive synopses based on linear counting. Our schemes using the proposed techniques efficiently compute the aggregates under a given accuracy guarantee. We provide theoretical analyses that show the advantages of the proposed techniques over previously known techniques in both space and time complexity. Furthermore, extensive experiments are made to validate the theoretical results and manifest the advantages of using the proposed techniques for sensor data aggregation.
隨著科技的發展,內嵌無線通訊、精密感測、計算等功能之感測器裝置的使用已日漸地成熟與普及。具通訊與計算能力的智慧型感測器裝置,提供了一個全新的資料收集模式,也創造出多元化的應用。在可想見的未來,無線感測器系統將大規模地融入人們的生活環境,提供大量、即時、且各式各樣的感測器資料,因此如何駕馭、管理與應用智慧型感測器系統所蒐集的資料,即成為一個極具挑戰性的研究課題。由於感測器裝置的硬體設計以成本為考量,因此感測器裝置本身所配備資源,諸如能源供應等,均受限制。因此,如何在資源受限的條件下,設計高效率的資料收集與傳輸技術,成為相關研究的焦點所在。在無線感測器網路應用中,除了單一節點資料的收集外,針對一群感測器節點資料之聚合查詢,如平均、總和、最大值等聚合值,也廣為使用。常見的感測器資料聚合查詢應用諸如,平均雨量的回報、最大濕度節點的尋找等。
現有最常見的感測器資料聚合方式為樹狀式聚合計算。樹狀式聚合計算,首先對感測器網路,建構一支以主機為根節點的擴張樹,連接網路節點。而後資料聚合計算則由葉節點開始層層地進行:各節點接收其子節點所傳送至之部份聚合值,並依本身資料與所接收之聚合值,計算出新的部份聚合值,往其母節點傳送;層層進行,最終於根節點計算出最終聚合值。然而,樹狀式聚合計算其缺點為通訊容錯能力不佳。現有感測器節點採用的通訊協定主要著眼點為低功耗的無線通訊。該協定確實提升能源的利用率,但代價為約百分之三十之封包流失率。也因此,樹狀式聚合計算被採用時,許多部份聚合值將因感測器節點間通訊上的失敗而遺失。單一節點部份聚合值的傳遞失敗,將造成以該感測器為根節點之子樹資料遺失,使得最終計算出來的聚合值遠偏離實際值。
欲提升感測器資料聚合計算上的通訊容錯能力,最直接的方法為採用可靠率較高的協定。然而,相對地需付出高感測器能源消耗之代價,造成能源管理使用上的負擔。因此,在使用現階段通訊協定前提下,以多路徑式傳遞為基礎來提升通訊容錯力之多路徑式聚合計算,近來則廣為相關研究所注意。有別於樹狀式聚合計算,在多路徑式聚合計算中,感測器節點廣播其部分聚合值至其上一層的節點。由於一筆部分聚合值有多筆複本在網路中傳遞與計算,唯有在所有複本皆遺失的狀況下,才會造成該部分聚合值的遺失,也因此在原本的協定下,大幅提升了聚合計算之容錯能力。
但也由於一筆部分聚合值有多筆複本在網路中傳遞,相同的資料可能會被接收與計算多次,衍生出重複計數的問題。因此,多路徑式聚合計算多結合複本無影響性之速寫結構(duplicate-insensitive synopses)來避免重複記數的問題。現有許多技術可被引用來產生複本無影響性之速寫結構,然而現有技術的直接引用帶來不同的優缺點。分析、了解、設計複本無影響性之速寫結構為本博士論文之研究主軸。本論文提出一系列之技術來生成複本無影響性之速寫結構,且提供資料聚合計算誤差結果之保證。並於論文中透由複雜度理論分析論文中所提出之技術,也透過完整的實驗模擬驗證本論文所提出方法之優越性與正確性。
[1] S. Madden, M. J. Franklin, J. M. Hellerstein et al., “TAG: a Tiny AGgregation service for ad-hoc sensor networks,” SIGOPS Operating System Review., vol. 36, no. SI, pp. 131-146, 2002.
[2] S. R. Madden, M. J. Franklin, J. M. Hellerstein et al., “TinyDB: an acquisitional query processing system for sensor networks,” ACM Transaction on Database Sys-tems, vol. 30, no. 1, pp. 122-173, 2005.
[3] Y. Yao, and J. Gehrke, “The cougar approach to in-network query processing in sensor networks,” SIGMOD Record, vol. 31, no. 3, pp. 9-18, 2002.
[4] J. Considine, M. Hadjieleftheriou, F. Li et al., “Robust approximate aggregation in sensor data management systems,” ACM Transactions on Database Systems, vol. 34, no. 1, pp. 1-35, 2009.
[5] Y.-C. Fan, and A. L. P. Chen, “Efficient and robust sensor data aggregation using linear counting sketches,” in 22nd IEEE International Symposium on Parallel and Distributed Processing Miami, Florida 2008, pp. 1-12.
[6] A. Manjhi, S. Nath, and P. B. Gibbons, “Tributaries and deltas: efficient and ro-bust aggregation in sensor network streams,” in 2005 ACM SIGMOD interna-tional conference on Management of Data, Baltimore, Maryland, 2005, pp. 287-298.
[7] S. Nath, P. B. Gibbons, S. Seshan et al., “Synopsis diffusion for robust aggregation in sensor networks,” ACM Transaction on Sensor Network, vol. 4, no. 2, pp. 1-40, 2008.
[8] N. Alon, Y. Matias, and M. Szegedy, “The space complexity of approximating the frequency moments,” in 28th annual ACM symposium on Theory of Computing, Philadelphia, Pennsylvania, United States, 1996, pp. 137-147.
[9] M. Durand, and P. Flajolet, “LogLog Counting of Large Cardinalities,” in Euro-pean Symps. on Algorithms, 2003, pp. 605--617.
[10] P. Flajolet, and G. N. Martin, “Probabilistic counting algorithms for data base ap-plications,” Journal of Computer and System Science, vol. 31, no. 2, pp. 182-209, 1985.
[11] K.-Y. Whang, B. T. Vander-Zanden, and H. M. Taylor, “A linear-time probabilistic counting algorithm for database applications,” ACM Transaction on Database Sys-tems, vol. 15, no. 2, pp. 208-229, 1990.
[12] Y.-C. Fan, and A. L. P. Chen, “Efficient and robust schemes for sensor data ag-gregation based on linear counting,” IEEE Transactions on Parallel and Distributed Systems, 2009, (Accepted for Publication).
[13] Y.-C. Fan, and A. L. P. Chen, “Energy Efficient Schemes for Accuracy-Guaranteed Sensor Data Aggregation using Scalable Counting,” IEEE Transactions on Knowl-edge and Data Engineering, under review 2010.
[14] S. Nath, P. B. Gibbons, S. Seshan et al., “Synopsis diffusion for robust aggregation in sensor networks,” in 2nd international conference on Embedded Networked Sensor Systems, Baltimore, MD, USA, 2004, pp. 250-262.
[15] Y.-C. Fan, W.-C. Lee, and A. L.P. Chen, “Space-efficient Synopses to Distinct Sum Problem over Distributed Data Sites,” in International Conference on Data Engi-neering (ICDE2011), 2011.(Submitted)
[16] A. Silberstein, R. Braynard, and J. Yang, “Constraint chaining: on energy-efficient continuous monitoring in sensor networks,” in Proceedings of the 2006 ACM SIGMOD international conference on Management of data, Chicago, IL, USA, 2006.
[17] Ziv Bar-Yossef, T. S. Jayram, Ravi Kumar et al., “Counting Distinct Elements in a Data Stream,” in 6th International Workshop on Randomization and Approxima-tion Techniques, 2002, pp. 1-10.
[18] D. Woodruff, “Optimal space lower bounds for all frequency moments,” in 15th annual ACM-SIAM symposium on Discrete algorithms, New Orleans, Louisiana, 2004, pp. 167-175.
[19] M. Charikar, S. Chaudhuri, R. Motwani et al., “Towards estimation error guaran-tees for distinct values,” in 18th ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, Dallas, Texas, United States, 2000, pp. 268-279.
[20] N. Shrivastava, C. Buragohain, D. Agrawal et al., “Medians and beyond: new ag-gregation techniques for sensor networks,” in 2nd international conference on Embedded Networked Sensor Systems, Baltimore, MD, USA, 2004, pp. 239-249.
[21] G. Cormode, and S. Muthukrishnan, “An improved data stream summary: the count-min sketch and its applications,” Journal of Algorithms, vol. 55, no. 1, pp. 58-75, 2005.
[22] A. Pavan, and S. Tirthapura, “Range-Efficient Computation of F0 over Massive Data Streams,” in 21st International Conference on Data Engineering, 2005, pp. 32-43.
[23] J. Y. Chen, G. Pandurangan, and D. Xu, “Robust Computation of Aggregates in Wireless Sensor Networks: Distributed Randomized Algorithms and Analysis,” IEEE Transactions on Parallel and Distributed Systems, vol. 17, no. 9, pp. 987-1000, 2006.
[24] S. Boyd, A. Ghosh, B. Prabhakar et al., “Gossip algorithms: design, analysis and applications,” in 24th Annual Joint Conference of the IEEE Computer and Com-munications, 2005, pp. 1653-1664
[25] C. Estan, G. Varghese, and M. Fisk, “Bitmap algorithms for counting active flows on high-speed links,” IEEE/ACM Transaction on Networking, vol. 14, no. 5, pp. 925-937, 2006.
[26] G. Barrenetxea, F. Ingelrest, G. Schaefer et al., “The hitchhiker's guide to success-ful wireless sensor network deployments,” in 6th ACM conference on Embedded Network Sensor Systems, Raleigh, NC, USA, 2008, pp. 43-56.
[27] O. Gnawali, K.-Y. Jang, J. Paek et al., “The tenet architecture for tiered sensor networks,” in 4th international conference on Embedded Networked Sensor Sys-tems, Boulder, Colorado, USA, 2006, pp. 153-166.