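Author: | 黃鈺程 Huang, Yu-Cheng |
---|---|
Thesis title: | 使用 Sparse R-CNN 與空間不確性進行多目標跟蹤 Multiple Object Tracking using Sparse R-CNN with Spatial Uncertainty |
Advisor: | 賴尚宏 Lai, Shang-Hong |
Oral defense committee: | 黃敬群 Huang, Ching-Chun; 陳駿丞 Chen, Jun-Cheng; 江振國 Chiang, Chen-Kuo |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Year of publication: | 2021 |
Academic year of graduation: | 109 (ROC calendar) |
Language: | English |
Number of pages: | 41 |
Chinese keywords: | 多目標跟蹤 (multiple object tracking), 空間不確定性 (spatial uncertainty) |
English keywords: | Spatial Uncertainty, Sparse R-CNN |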
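Occlusion is a very difficult problem in multiple object tracking. When a tracked target is (partially) occluded, both the detected bounding box and the extracted appearance features are inaccurate. This thesis describes how we use spatial uncertainty to handle this situation. First, we extend Sparse R-CNN into an object detection model that also predicts spatial uncertainty, which describes the degree to which an object is occluded in each direction. We then use these uncertainties to assist the motion model and the computation of appearance features, and finally perform multiple object tracking. Our multiple object tracker, SPRCNN_SU, achieves leading results on the MOT16, MOT17, and MOT20 challenges, with MOTA of 73.6%, 72.9%, and 67.1%, respectively.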
Occlusion is a major obstacle to reliable multiple object tracking (MOT). When a target is occluded, the detected bounding box and the appearance features extracted from it may not provide reliable information for tracking. In this work, we address this issue by introducing spatial uncertainty into MOT. First, we extend Sparse R-CNN to predict a spatial uncertainty for each detection in addition to its bounding box and confidence score. Spatial uncertainty models the degree of occlusion in each direction and is then used in motion modeling and appearance embedding estimation. Our tracker, SPRCNN_SU, achieves leading performance on three popular challenges, MOT16, MOT17, and MOT20, with 73.6%, 72.9%, and 67.1% MOTA, respectively.
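The abstract does not say how spatial uncertainty enters the motion model. One common way to use a per-side uncertainty in a tracker is to treat it as measurement noise in a Kalman filter update, so that an occluded box edge pulls the track less than a clearly visible one. The sketch below illustrates only that general idea under simplifying assumptions: the state is just the four box edges, the observation matrix is the identity, and the function name `kalman_update` and all numbers are hypothetical. It is not the thesis's implementation.

```python
import numpy as np


def kalman_update(mean, cov, z, sigma):
    """One Kalman measurement update with an identity observation matrix.

    mean, cov : prior estimate of the box edges [left, top, right, bottom]
    z         : detected box edges
    sigma     : assumed per-edge standard deviation predicted by the detector
    """
    R = np.diag(np.square(sigma))        # measurement noise from the uncertainty
    S = cov + R                          # innovation covariance (H = I)
    K = cov @ np.linalg.inv(S)           # Kalman gain
    new_mean = mean + K @ (z - mean)     # edges with large sigma move less
    new_cov = (np.eye(len(mean)) - K) @ cov
    return new_mean, new_cov


# Hypothetical example: the right edge of the target is hidden by an occluder,
# so the detector reports a box that is too narrow and flags that edge as uncertain.
prior_mean = np.array([100.0, 50.0, 140.0, 150.0])
prior_cov = np.eye(4) * 4.0
detection = np.array([102.0, 52.0, 130.0, 152.0])
sigma = np.array([1.0, 1.0, 10.0, 1.0])

post_mean, post_cov = kalman_update(prior_mean, prior_cov, detection, sigma)
print(post_mean)
```

With these numbers the left, top, and bottom edges move most of the way toward the detection, while the right edge stays close to its prior value even though its detected position is 10 pixels off. A full tracker would also carry velocity terms in the state and a prediction step between frames.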