Graduate student: | 賈醫菱 Jia, Yi Ling |
---|---|
Thesis title: | 以感域雜湊法加速軌跡查詢 Fast Trajectory Query via Locality-Sensitive Hashing |
Advisor: | 李哲榮 Lee, Che-Rung |
Committee members: | 彭文志 Peng, Wen-Chih; 徐正炘 Hsu, Cheng-Hsin |
Degree: | Master (碩士) |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science (電機資訊學院 - 資訊工程學系) |
Year of publication: | 2016 |
Academic year of graduation: | 104 |
Language: | English |
Number of pages: | 31 |
Chinese keywords: | GPS軌跡、軌跡相似性、向量場 |
English keywords: | GPS trajectory, Trajectory similarity, LCSS, Vector field |
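With the rapid development and spread of portable GPS devices, more and more trajectory data are being collected, stored, and used in all kinds of data analysis. Trajectory similarity search, which retrieves the trajectories similar to a given one, is a foundation of trajectory analysis. In this thesis, we model trajectory data in a vector field; based on this model, we can measure the similarity of two trajectories in the resulting vector space. Under the same model, most dissimilar trajectories can be filtered out with Locality-Sensitive Hashing, which enables fast and effective trajectory queries. Experiments on the Geolife dataset show that the method filters out nearly 70% of dissimilar trajectories at a recall close to 98%. Moreover, for the same ten thousand trajectory queries, our method is a hundred times more time-efficient than the traditional Longest Common Subsequence method.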
With the increasing number of mobile GPS devices, more and more trajectory data are collected, stored, and analyzed for various applications. One of the basic operations in trajectory analysis is the similarity query, which retrieves the trajectories similar to a given one. In this thesis, we model trajectory data as vector fields, so that the similarity between two trajectories can be measured in the vector space they are transformed into. We call this similarity measure for trajectory data Cosine Similarity for Vector Field (CSVF). With this model, trajectory queries can be performed efficiently by using Locality-Sensitive Hashing (LSH) to filter out most dissimilar trajectories. Experiments on the Geolife dataset demonstrate that LSH can filter out nearly 70% of the candidate trajectories while keeping the recall close to 98%. The experiments also show that CSVF is 100 times faster than the traditional Longest Common Subsequence (LCSS) method when querying ten thousand trajectories.
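The abstract does not spell out how trajectories are turned into vector fields or which LSH family is used, so the following is only a minimal sketch of the general idea under stated assumptions, not the thesis's implementation. It assumes a fixed square grid over a normalized bounding box, bins each trajectory segment by its midpoint, and uses random-hyperplane signatures, a standard LSH family for cosine similarity. All function names, the grid size, the number of bits, and the Hamming threshold are hypothetical choices made for illustration.

```python
import math
import random


def trajectory_to_field(points, grid=8, bounds=(0.0, 0.0, 1.0, 1.0)):
    """Flatten a trajectory into a 2*grid*grid vector of per-cell displacements.

    Assumption: the grid layout and midpoint binning are illustrative only.
    """
    min_x, min_y, max_x, max_y = bounds
    field = [0.0] * (2 * grid * grid)
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Bin each segment by its midpoint, clamped to the grid.
        mx, my = (x0 + x1) / 2.0, (y0 + y1) / 2.0
        cx = min(grid - 1, max(0, int((mx - min_x) / (max_x - min_x) * grid)))
        cy = min(grid - 1, max(0, int((my - min_y) / (max_y - min_y) * grid)))
        cell = 2 * (cy * grid + cx)
        field[cell] += x1 - x0      # accumulated x displacement in this cell
        field[cell + 1] += y1 - y0  # accumulated y displacement in this cell
    return field


def cosine_similarity(u, v):
    """Cosine of the angle between two flattened fields (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplane normals of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(vec, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(h * x for h, x in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(a, b):
    """Number of positions at which two signatures differ."""
    return sum(x != y for x, y in zip(a, b))


def filter_candidates(query_sig, db_sigs, max_hamming):
    """Indices of database signatures within max_hamming bits of the query."""
    return [i for i, s in enumerate(db_sigs) if hamming(query_sig, s) <= max_hamming]


# Toy data: an eastbound trajectory, a slightly tilted copy, and its reverse.
east = [(0.05 + 0.1 * i, 0.5) for i in range(10)]
near_east = [(0.05 + 0.1 * i, 0.5 + 0.005 * i) for i in range(10)]
west = list(reversed(east))

fields = [trajectory_to_field(t) for t in (east, near_east, west)]
planes = make_hyperplanes(dim=len(fields[0]), n_bits=32)
signatures = [lsh_signature(f, planes) for f in fields]

# Cheap filter first, exact cosine similarity only on the survivors.
candidates = filter_candidates(signatures[0], signatures[1:], max_hamming=8)
for i in candidates:
    print(i, cosine_similarity(fields[0], fields[1 + i]))
```

The point of the two-stage design is that comparing short bit signatures is far cheaper than an exact similarity computation, so the exact measure only runs on trajectories that survive the filter. A looser Hamming threshold raises recall at the cost of filtering out fewer candidates, which is the trade-off the reported recall and filtering rates describe.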