
Author: 鄧雅心 (Teng, Ya-Hsin)
Title: Occlusion Handling in Visual Tracking Based on Kalman Filters and Discriminative Models with Color and Depth Information
Advisor: 蘇豐文 (Soo, Von-Wun)
Committee members: 沈之涯 (Shen, Chih-Ya); 劉吉軒 (Liu, Jyi-Shane)
Degree: Master
Department: Department of Computer Science, College of Electrical Engineering and Computer Science
Year of publication: 2022
Academic year of graduation: 110
Language: English
Number of pages: 53
Chinese keywords: 視覺物件追蹤, 深度學習, 遮蔽處理, 顏色及深度資訊
English keywords: Visual Object Tracking, Deep Learning, Occlusion Handling, RGB-D Information
    Visual object tracking has long been an active research topic, evolving from traditional image-processing methods, through discriminative models, to today's deep learning approaches. During tracking, the target is often blocked by obstacles. Most existing work handles occlusion by estimating the target's current position or by searching the entire frame for the region most similar to the target. When the occluder resembles the target itself, the tracking algorithm may fail to tell the two apart, and if such frames are fed back to the tracker as training samples, the model's accuracy on the true target can degrade. If we can determine more precisely when the target is occluded, the model can be trained more reliably and locate the target more accurately.
    Depth information gives us the 3D position of the target, providing more information and a better understanding of its direction of motion. In this thesis, we propose a visual tracking algorithm that handles occlusion using depth information and a Kalman filter. Tracking itself is performed by a discriminative model; for occlusion handling, depth information is used alongside color information to decide when the target is occluded, and a Kalman filter estimates the occluded target's trajectory. In addition to the Kalman-predicted position, a multi-position estimation scheme locates the target more precisely. The experimental results show that the proposed method improves tracking performance.
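As a rough illustration of how depth can signal occlusion, the sketch below flags a frame as occluded when most pixels inside the target's bounding box lie noticeably in front of the target's expected depth. The function name, threshold values, and toy data are illustrative assumptions, not the implementation used in the thesis.

```python
import numpy as np

def occlusion_by_depth(depth_map, bbox, expected_depth, tol=0.3):
    """Flag occlusion when most pixels inside the target box are
    significantly closer to the camera than the target's expected depth.

    depth_map: HxW array of depths in meters; bbox: (x, y, w, h);
    expected_depth: target depth extrapolated from previous frames;
    tol: relative depth margin (illustrative value)."""
    x, y, w, h = bbox
    region = depth_map[y:y + h, x:x + w]
    valid = region[region > 0]             # ignore missing depth readings
    if valid.size == 0:
        return True                        # no depth evidence: treat as occluded
    closer = valid < expected_depth * (1.0 - tol)
    return bool(closer.mean() > 0.5)       # majority of pixels lie in front

# Toy example: a 1 m obstacle covers most of a target expected at 3 m.
depth = np.full((10, 10), 3.0)
depth[2:8, 2:8] = 1.0
print(occlusion_by_depth(depth, (1, 1, 8, 8), expected_depth=3.0))  # True
```

In practice the expected depth would come from the tracker's own motion model, so the check complements rather than replaces the color-based similarity score.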


    Visual tracking has been studied for many years, and one of its most challenging problems is occlusion handling. A tracked target in a complex scene can be occluded by environmental objects such as buildings or people, which often causes tracking algorithms to lose track of the target. Various occlusion handling techniques have been proposed, but most are purely vision-based: they search the whole scene and compare the similarity between the target and candidate bounding boxes, ignoring the depth of the target and of the environmental objects. Depth information provides the 3D position of the target, which gives more information and a better understanding of the target's moving direction. In this thesis, we propose a visual tracking algorithm with depth-based occlusion handling.
    Our tracker is based on a discriminative model with a Siamese network architecture that detects the target position using both the color and depth information of objects in an image. For occlusion handling, trajectory estimation is based on a Kalman filter. We propose a multi-position handling method that improves the accuracy of the search for the target when it is occluded and the track is lost.
    The final results show an average F1-score improvement of 5.2 percent over the baseline DepthTrack tracker (DeT) on the DepthTrack benchmark dataset, demonstrating that the proposed method improves RGB-D visual object tracking.
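To illustrate the trajectory-estimation idea, the following is a minimal constant-velocity Kalman filter over image coordinates: while the target is visible, the filter is updated with detections; during occlusion, it keeps predicting to extrapolate the target's path. The state model, noise settings, and numbers are illustrative assumptions, not the thesis's exact formulation.

```python
import numpy as np

# Constant-velocity Kalman filter with state (x, y, vx, vy).
F = np.array([[1, 0, 1, 0],    # state transition: position += velocity
              [0, 1, 0, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], float)
H = np.array([[1, 0, 0, 0],    # we observe position only
              [0, 1, 0, 0]], float)
Q = np.eye(4) * 1e-2           # process noise (illustrative)
R = np.eye(2) * 1.0            # measurement noise (illustrative)

def predict(x, P):
    """Propagate state and covariance one frame forward."""
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    """Correct the prediction with an observed target position z."""
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P

# While the target is visible, predict then update with the detection;
# during occlusion, keep predicting to extrapolate the trajectory.
x = np.array([0.0, 0.0, 2.0, 1.0])
P = np.eye(4)
for z in ([2.1, 0.9], [3.9, 2.1]):   # two observed frames
    x, P = predict(x, P)
    x, P = update(x, P, np.array(z))
x, P = predict(x, P)                  # occluded frame: prediction only
print(x[:2])                          # extrapolated position ahead of the last detection
```

The extrapolated position can then seed a multi-position search around the predicted location, along the lines the abstract describes.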

    摘要 (Abstract in Chinese)
    Abstract
    Acknowledgement
    List of Tables
    List of Figures
    1 Introduction ........................ 1
    2 Related Work ........................ 5
    3 Methodology ......................... 11
    4 Experiments and Results ............. 34
    5 Conclusion and Future Work .......... 43
    References ............................ 46

    [1] Yang Li and Jianke Zhu. A scale adaptive kernel correlation filter tracker with feature integration. In European Conference on Computer Vision, pages 254–265. Springer, 2014.
    [2] Hamed Kiani Galoogahi, Ashton Fagg, and Simon Lucey. Learning background-aware correlation filters for visual tracking. In Proceedings of the IEEE International Conference on Computer Vision, pages 1135–1143, 2017.
    [3] Shuran Song and Jianxiong Xiao. Tracking revisited using RGBD camera: Unified benchmark and baselines. In Proceedings of the IEEE International Conference on Computer Vision, pages 233–240, 2013.
    [4] Alan Lukezic, Ugur Kart, Jani Käpylä, Ahmed Durmush, Joni-Kristian Kämäräinen, Jiri Matas, and Matej Kristan. CDTB: A color and depth visual object tracking dataset and benchmark. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 10013–10022, 2019.
    [5] Song Yan, Jinyu Yang, Jani Käpylä, Feng Zheng, Aleš Leonardis, and Joni-Kristian Kämäräinen. DepthTrack: Unveiling the power of RGBD tracking. CoRR, abs/2108.13962, 2021. URL https://arxiv.org/abs/2108.13962.
    [6] Goutam Bhat, Martin Danelljan, Luc Van Gool, and Radu Timofte. Learning discriminative model prediction for tracking. CoRR, abs/1904.07220, 2019. URL http://arxiv.org/abs/1904.07220.
    [7] Martin Danelljan, Goutam Bhat, Fahad Shahbaz Khan, and Michael Felsberg. ATOM: Accurate tracking by overlap maximization. CoRR, abs/1811.07628, 2018. URL http://arxiv.org/abs/1811.07628.
    [8] Greg Welch, Gary Bishop, et al. An introduction to the Kalman filter. 1995.
    [9] Zhaoxia Fu and Yan Han. Centroid weighted Kalman filter for visual object tracking. Measurement, 45(4):650–655, 2012. ISSN 0263-2241. doi: 10.1016/j.measurement.2012.01.004. URL https://www.sciencedirect.com/science/article/pii/S026322411200005X.
    [10] Zebin Cai, Zhenghui Gu, Zhu Yu, Hao Liu, and Ke Zhang. A real-time visual object tracking system based on Kalman filter and MB-LBP feature matching. Multimedia Tools and Applications, 75, 2014. doi: 10.1007/s11042-014-2411-6.
    [11] Dorin Comaniciu and Visvanathan Ramesh. Mean shift and optimal prediction for efficient object tracking. In Proceedings 2000 International Conference on Image Processing (Cat. No. 00CH37101), volume 3, pages 70–73. IEEE, 2000.
    [12] Huiyu Zhou, Yuan Yuan, and Chunmei Shi. Object tracking using SIFT features and mean shift. Computer Vision and Image Understanding, 113(3):345–352, 2009.
    [13] Changjiang Yang, R. Duraiswami, and L. Davis. Fast multiple object tracking via a hierarchical particle filter. In Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1, volume 1, pages 212–219, 2005. doi: 10.1109/ICCV.2005.95.
    [14] Zulfiqar Hasan Khan, Irene Yu-Hua Gu, and Andrew G. Backhouse. Robust visual object tracking using multi-mode anisotropic mean shift and particle filters. IEEE Transactions on Circuits and Systems for Video Technology, 21(1):74–87, 2011. doi: 10.1109/TCSVT.2011.2106253.
    [15] Irene Anindaputri Iswanto and Bin Li. Visual object tracking based on mean-shift and particle-Kalman filter. Procedia Computer Science, 116:587–595, 2017.
    [16] Martin Danelljan, Gustav Hager, Fahad Shahbaz Khan, and Michael Felsberg. Convolutional features for correlation filter based visual tracking. In Proceedings of the IEEE International Conference on Computer Vision (ICCV) Workshops, December 2015.
    [17] Alan Lukezic, Tomas Vojir, Luka Čehovin Zajc, Jiri Matas, and Matej Kristan. Discriminative correlation filter with channel and spatial reliability. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6309–6318, 2017.
    [18] Matthias Mueller, Neil Smith, and Bernard Ghanem. Context-aware correlation filter tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1396–1404, 2017.
    [19] Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, and Philip H. S. Torr. Fully-convolutional siamese networks for object tracking. In European Conference on Computer Vision, pages 850–865. Springer, 2016.
    [20] Qing Guo, Wei Feng, Ce Zhou, Rui Huang, Liang Wan, and Song Wang. Learning dynamic siamese network for visual object tracking. In Proceedings of the IEEE International Conference on Computer Vision, pages 1763–1771, 2017.
    [21] Anfeng He, Chong Luo, Xinmei Tian, and Wenjun Zeng. A twofold siamese network for real-time object tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
    [22] Xingping Dong and Jianbing Shen. Triplet loss in siamese network for object tracking. In Proceedings of the European Conference on Computer Vision (ECCV), pages 459–474, 2018.
    [23] Bin Yan, Houwen Peng, Jianlong Fu, Dong Wang, and Huchuan Lu. Learning spatio-temporal transformer for visual tracking. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 10448–10457, 2021.
    [24] Feng Xiao, Qiuxia Wu, and Han Huang. Single-scale siamese network based RGB-D object tracking with adaptive bounding boxes. Neurocomputing, 451:192–204, 2021.
    [25] Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, and Philip H. S. Torr. Fully-convolutional siamese networks for object tracking, 2016. URL https://arxiv.org/abs/1606.09549.
    [26] Borui Jiang, Ruixuan Luo, Jiayuan Mao, Tete Xiao, and Yuning Jiang. Acquisition of localization confidence for accurate object detection. In Proceedings of the European Conference on Computer Vision (ECCV), pages 784–799, 2018.
    [27] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
    [28] Borui Jiang, Ruixuan Luo, Jiayuan Mao, Tete Xiao, and Yuning Jiang. Acquisition of localization confidence for accurate object detection. In Proceedings of the European Conference on Computer Vision (ECCV), pages 784–799, 2018.
    [29] Matej Kristan, Jiri Matas, Aleš Leonardis, Tomas Vojir, Roman Pflugfelder, Gustavo Fernandez, Georg Nebehay, Fatih Porikli, and Luka Čehovin. A novel performance evaluation methodology for single-target trackers. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(11):2137–2155, Nov 2016. ISSN 0162-8828. doi: 10.1109/TPAMI.2016.2516982.
