
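Graduate student: 黃國柱 (Huang, Guo-Jhu)
Thesis title: Coverage Maximization on Spatial Databases Considering Bi-chromatic Reverse k-Nearest Neighbors (於空間資料庫考量雙資料型態的反向最近點之涵蓋最大化)
Advisor: 陳良弼 (Chen, L. P.)
Oral defense committee: 柯佳伶, 顏秀珍, 陳良弼
Degree: Master
Department: College of Electrical Engineering and Computer Science - Computer Science
Year of publication: 2011
Academic year of graduation: 100 (ROC calendar)
Language: English
Number of pages: 37
Keywords (Chinese): 反向最近點 (reverse nearest neighbor), 雙資料型態反向最近點 (bi-chromatic reverse nearest neighbor), 最大化 (maximization)
Keywords (English): RkNN, BRkNN, Maximization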
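  • A reverse k-nearest-neighbor (RkNN) query retrieves the data points that have the query point among their k nearest neighbors. A bi-chromatic reverse k-nearest-neighbor (BRkNN) query is a variant of the RkNN query that considers two datasets of different types. Given two datasets S and C of different types, a BRkNN query for a query point in S retrieves the data points that have the query point among their k nearest neighbors. Many existing methods answer the BRkNN query for a single query point only. In practical applications, however, we may want to find multiple query points in S such that the union of the data point sets returned by their BRkNN queries is as large as possible. We call this problem coverage maximization on spatial data considering bi-chromatic reverse k nearest neighbors. For this problem, first computing the BRkNN answer for every point in S and then outputting the points with the largest answer sets leads to poor quality, because the BRkNN answer sets of the output points may overlap heavily. The problem has also been proven NP-hard, so we design two heuristic methods that address it under different criteria: time efficiency and result quality. We run a series of experiments on synthetic data and real data to evaluate the two proposed methods.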
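The definitions in the abstract can be written compactly as follows. This is only a notational sketch: the symbol kNN_S(c) (the k points of S nearest to c), the budget t (suggested by the "BrMax-t" query name in the table of contents), and the name Q* are assumptions introduced here for illustration and may differ from the notation used in the thesis itself.

```latex
% BRkNN answer set of a query point q in S:
% the points of C that count q among their k nearest neighbors in S.
\[
\mathrm{BRkNN}(q) = \{\, c \in C : q \in \mathrm{kNN}_S(c) \,\}, \qquad q \in S
\]

% Coverage maximization: choose t points of S whose answer sets
% have the largest union.
\[
Q^{*} = \operatorname*{arg\,max}_{Q \subseteq S,\ |Q| = t}
\Bigl|\, \bigcup_{q \in Q} \mathrm{BRkNN}(q) \Bigr|
\]
```

The objective counts each point of C at most once, which is why picking the t points with the largest individual answer sets is not necessarily optimal.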


    A reverse k-nearest-neighbor (RkNN) query retrieves the data points that have the query point as one of their k nearest neighbors. A bi-chromatic reverse k-nearest-neighbor (BRkNN) query is a variant of the RkNN query that considers two types of data. Given two datasets S and C of different types, a BRkNN query for a data point q in S retrieves the data points in C that have q as one of their k nearest neighbors in S. Many existing approaches answer the BRkNN query for a single data point q in S. In real applications, however, one may want to find a set of points in S that maximizes the cardinality of the union of their BRkNN answer sets. We call this problem coverage maximization on spatial databases considering BRkNN (the coverage maximization problem for short). Computing the BRkNN answer set of each point in S and then choosing the points with the largest answer sets may yield poor quality, since the BRkNN answer sets of the points in S may overlap. In this thesis, we design two heuristic approaches to this problem that consider different criteria: time efficiency and answer set quality. A series of experiments on synthetic and real datasets is performed to evaluate the two approaches.
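To make the overlap issue concrete, here is a minimal Python sketch. It is not the thesis's algorithm: it computes BRkNN answer sets by brute force and compares the "largest answer sets" baseline with the standard greedy heuristic for maximum coverage. All function names, the budget `t`, and the toy coordinates are hypothetical and chosen only for illustration.

```python
from math import dist


def brknn_sets(S, C, k):
    """Brute force: map each index i of S to the indices of the points in C
    that have S[i] among their k nearest neighbors in S."""
    answers = {i: set() for i in range(len(S))}
    for j, c in enumerate(C):
        # Rank the points of S by distance to c (ties broken by index).
        nearest = sorted(range(len(S)), key=lambda i: (dist(S[i], c), i))[:k]
        for i in nearest:
            answers[i].add(j)
    return answers


def top_by_size(answers, t):
    """Baseline: pick the t points with the largest individual answer sets."""
    return sorted(answers, key=lambda i: (-len(answers[i]), i))[:t]


def greedy_cover(answers, t):
    """Standard greedy max coverage: repeatedly pick the point whose answer
    set adds the most not-yet-covered points of C."""
    chosen, covered = [], set()
    for _ in range(min(t, len(answers))):
        best = max((i for i in answers if i not in chosen),
                   key=lambda i: (len(answers[i] - covered), -i))
        chosen.append(best)
        covered |= answers[best]
    return chosen, covered


def coverage(answers, chosen):
    """Size of the union of the answer sets of the chosen points."""
    return len(set().union(*(answers[i] for i in chosen))) if chosen else 0


# Toy data: two pairs of sites in S, each pair sharing one cluster of C.
S = [(0, 0), (1, 0), (20, 0), (21, 0)]
C = [(0.4, 0), (0.4, 1), (0.6, 0), (0.6, 1),   # cluster near S[0], S[1]
     (20.4, 0), (20.4, 1), (20.6, 0)]          # cluster near S[2], S[3]

answers = brknn_sets(S, C, k=2)
naive = top_by_size(answers, t=2)
greedy, covered = greedy_cover(answers, t=2)
print(naive, coverage(answers, naive))   # [0, 1] 4
print(greedy, len(covered))              # [0, 2] 7
```

In this toy instance the two largest answer sets are identical, so the baseline covers only 4 points of C, while the greedy choice covers all 7. The brute-force step costs a full sort of S for every point of C, so it is meant for small illustrations only.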

    Acknowledgement
    Abstract
    Table of Contents
    List of Figures
    1 Introduction
    2 Related Works
    3 Preliminaries
      3.1 Notations and Problem Definition
      3.2 Index Structure
      3.3 Properties of Voronoi Diagram
      3.4 Pre-processing
    4 BrMax-t Query Processing
      4.1 Single BRkNN Processing
      4.2 Incremental Top-1 Processing
      4.3 Exchange Processing
    5 Performance Evaluation
      5.1 Descriptions of Test Datasets and Experiment Factors
      5.2 Results of BrMax-t query
    6 Conclusions
    References

