
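Graduate student: 黃鈺程 (Huang, Yu-Cheng)
Thesis title: Multiple Object Tracking using Sparse R-CNN with Spatial Uncertainty (使用 Sparse R-CNN 與空間不確定性進行多目標跟蹤)
Advisor: 賴尚宏 (Lai, Shang-Hong)
Committee members: 黃敬群 (Huang, Ching-Chun)
陳駿丞 (Chen, Jun-Cheng)
江振國 (Chiang, Chen-Kuo)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar)
Language: English
Number of pages: 41
Keywords (translated from Chinese): Multiple Object Tracking, Spatial Uncertainty
Keywords (English): Spatial Uncertainty, Sparse R-CNN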
    Occlusion is one of the hardest problems in multiple object tracking (MOT). When a tracked target is partially or fully occluded, the detected bounding box and the appearance features extracted from it become unreliable. In this work, we address this issue by introducing spatial uncertainty into MOT. First, we extend Sparse R-CNN to predict spatial uncertainty in addition to bounding boxes and confidence scores; the spatial uncertainty models the degree of occlusion in each direction. We then use this uncertainty in motion modeling and appearance embedding estimation to perform tracking. Our tracker, SPRCNN_SU, achieves leading results on the MOT16, MOT17, and MOT20 challenges, with MOTA of 73.6%, 72.9%, and 67.1%, respectively.
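    The abstract reports results in MOTA, the accuracy measure from the CLEAR MOT metrics. As a minimal sketch of how that score is usually defined, the snippet below computes MOTA = 1 - (FN + FP + IDSW) / GT from error counts summed over all frames. The function name `mota` and the counts in the usage lines are hypothetical and chosen only for illustration; they are not results or code from the thesis.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Multiple Object Tracking Accuracy (CLEAR MOT definition).

    All arguments are counts accumulated over every frame of a sequence:
    missed targets (FN), spurious detections (FP), identity switches (IDSW),
    and the total number of ground-truth objects (GT).
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


# Hypothetical counts for illustration only (not taken from the thesis).
score = mota(false_negatives=150, false_positives=40, id_switches=10, num_gt=1000)
print(f"MOTA = {score:.1%}")
```

    Because the three error types are summed before dividing by the ground-truth count, MOTA can be negative when a tracker makes more errors than there are ground-truth objects.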

    1 Introduction
        1.1 Problem Definition
        1.2 Motivation
        1.3 Contributions
    2 Related Works
        2.1 Tracking by Detection
        2.2 Tracking by Propagation
        2.3 Object Detection with Spatial Uncertainty
    3 Proposed Method
        3.1 Overview
        3.2 Sparse R-CNN
        3.3 Sparse R-CNN with Spatial Uncertainty
        3.4 Association with Spatial Uncertainty
            3.4.1 Geometric
            3.4.2 Appearance Embedding
    4 Experiments
        4.1 Datasets
        4.2 Evaluation Metrics
        4.3 Implementation Details
        4.4 Ablation Study
            4.4.1 Spatial Uncertainty v.s. Occlusion Ratio
            4.4.2 Object Detector with Spatial Uncertainty
            4.4.3 Association with Spatial Uncertainty
        4.5 Main Results
        4.6 Analysis
    5 Conclusions
    References

