Author: Shih, Yi-Hsuan (施亦宣)
Thesis title: Vehicles Detection at Urban Intersections via Adaptive Neighbor Sets of Nonparametric Scene Parsing (以最適鄰近集之非參數場景剖析技術應用於偵測十字路口之車輛)
Advisor: Wang, Jia-Shung (王家祥)
Committee members: Yeh, Mei-Chen (葉梅珍); Chen, Hwann-Tzong (陳煥宗)
Degree: Master
Department:
Year of publication: 2017
Academic year of graduation: 105 (ROC calendar, 2016-2017)
Language: English
Number of pages: 51
Keywords: Object recognition, Image segmentation, Image parsing, Scene understanding, Context indexes, Vehicles detection, Computer vision
  • Many experiments show that vehicle detection at urban intersections faces two challenges. First, a bounding box must be drawn by hand around each vehicle in the first frame in which it appears. Second, the target vehicle may be lost during tracking, which causes tracking to fail. The specific objective of this thesis is therefore to explore feasible solutions to these problems.
    We apply nonparametric scene parsing to vehicle detection at urban intersections so that cars and motorcycles are detected automatically in the first frame in which they appear, with no manually supplied bounding box. Moreover, the annotations produced by scene parsing can help recover targets that are lost during tracking.
    Nonparametric scene parsing has been studied extensively in recent years. It annotates a query image by transferring labels from a training data set. Following the method of [5], our method first segments each image into superpixels. Using features computed on them, it then retrieves similar images from the training data set to form the retrieval set. In addition, we learn a weight for each image in the training data set with a leave-one-out strategy so as to minimize classification error. To improve the classification of rare classes, we compute the semantic context of the segments in the training data set and add the rare-class examples nearest to the query image to the retrieval set. Finally, we label the query image by minimizing an energy function defined on a Markov Random Field (MRF). Because our main test data consist of urban intersection scenes, we use background subtraction to extract the foreground, which reduces classification error.
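The retrieval and label-transfer steps described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the Euclidean distance, the one- and two-dimensional toy features, and the uniform `image_weight` array (standing in for the learned per-image weights) are all assumptions made for illustration, and the rare-class expansion and MRF smoothing steps are omitted.

```python
import numpy as np

def retrieval_set(query_desc, train_descs, k):
    """Return indices of the k training images nearest to the query image."""
    dists = np.linalg.norm(train_descs - query_desc, axis=1)
    return np.argsort(dists)[:k]

def transfer_labels(query_feats, seg_feats, seg_labels, seg_image,
                    image_weight, retrieved, n_classes, n_neighbors=2):
    """Label each query superpixel by a weighted vote among its nearest
    training superpixels, restricted to images in the retrieval set."""
    # Training superpixels that belong to a retrieved image.
    pool = np.flatnonzero(np.isin(seg_image, retrieved))
    labels = np.empty(len(query_feats), dtype=int)
    for i, q in enumerate(query_feats):
        d = np.linalg.norm(seg_feats[pool] - q, axis=1)
        nearest = pool[np.argsort(d)[:n_neighbors]]
        # Each neighbor votes for its label with its source image's weight.
        votes = np.bincount(seg_labels[nearest],
                            weights=image_weight[seg_image[nearest]],
                            minlength=n_classes)
        labels[i] = np.argmax(votes)
    return labels

# Toy data (hypothetical): 3 training images, 5 training superpixels.
train_descs = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
seg_feats = np.array([[1.0], [1.1], [9.0], [9.2], [1.05]])
seg_labels = np.array([0, 0, 1, 1, 1])      # 0 = road, 1 = car
seg_image = np.array([0, 0, 1, 1, 2])       # source image of each superpixel
image_weight = np.ones(3)                   # stand-in for learned weights

retrieved = retrieval_set(np.array([0.02, 0.0]), train_descs, k=2)
labels = transfer_labels(np.array([[1.02], [9.1]]), seg_feats, seg_labels,
                         seg_image, image_weight, retrieved, n_classes=2)
print(labels)
```

In this toy run the third training image is excluded from the retrieval set, so its superpixel cannot vote even though its feature is close to the first query superpixel. Restricting the vote to retrieved images is what ties the labeling to globally similar scenes.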
    Our experimental results show that combining nonparametric scene parsing with background subtraction can effectively address the problems of vehicle detection at urban intersections.
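One way to combine a parsed label map with background subtraction is sketched below. This record does not say which background model the thesis uses or how the foreground is merged with the labels, so the per-pixel temporal median, the fixed threshold, the grayscale frames, and the function names here are assumptions chosen for brevity.

```python
import numpy as np

def foreground_mask(history, frame, threshold=30.0):
    """Mark pixels that differ from the per-pixel temporal median of
    `history` by more than `threshold` gray levels."""
    background = np.median(history, axis=0)
    return np.abs(frame.astype(float) - background) > threshold

def suppress_static_vehicles(label_map, mask, vehicle_ids, fallback):
    """Replace vehicle labels that fall on the static background."""
    out = label_map.copy()
    out[np.isin(label_map, vehicle_ids) & ~mask] = fallback
    return out

# Toy data (hypothetical): five identical 4x4 background frames, then a
# frame in which a bright 2x2 object has entered the scene.
history = np.full((5, 4, 4), 100, dtype=np.uint8)
frame = history[0].copy()
frame[1:3, 1:3] = 200

# Parsed labels: 0 = road, 1 = car. Pixel (0, 0) is a false "car".
label_map = np.zeros((4, 4), dtype=int)
label_map[1:3, 1:3] = 1
label_map[0, 0] = 1

mask = foreground_mask(history, frame)
cleaned = suppress_static_vehicles(label_map, mask, vehicle_ids=[1], fallback=0)
print(cleaned)
```

The false car label on the static pixel is reset to the fallback class, while the labels on the moving object are kept. A fuller pipeline might instead fall back to the next most likely non-vehicle label for each pixel.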

    Acknowledgements
    Chinese Abstract
    Abstract
    Contents
    List of Figures
    List of Tables
    Chapter 1. Introduction
    Chapter 2. Related Works
        2.1 Segmentation
        2.2 Retrieval Set
        2.3 Markov Random Field (MRF)
    Chapter 3. Proposed Methods
        3.1 Retrieval set
            3.1.1 Learning Weights
            3.1.2 Context Index
        3.2 Markov Random Field (MRF) Framework
        3.3 Background Subtraction
    Chapter 4. Experimental Results
        4.1 Database and Parameters Setting
        4.2 Campus photos in NTHU
        4.3 Urban intersections
            4.3.1 Data set of Guangfu Rd. and Shuiyuan St.
            4.3.2 Data set of Guangfu Rd. and Jiangong Rd.
    Chapter 5. Conclusion and Future Work
    References

    [1] A. Asghar and N. I. Rao, "Semantics Sensitive Segmentation and Annotation of Natural Images," in 2008 IEEE International Conference on Signal Image Technology and Internet Based Systems, 2008, pp. 387-394.
    [2] Y. Boykov, O. Veksler, and R. Zabih, "Fast approximate energy minimization via graph cuts," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, pp. 1222-1239, 2001.
    [3] S. Di, H. Zhang, X. Mei, D. Prokhorov, and H. Ling, "Spatial Prior for Nonparametric Road Scene Parsing," in 2015 IEEE 18th International Conference on Intelligent Transportation Systems, 2015, pp. 1209-1214.
    [4] C. Domeniconi, J. Peng, and D. Gunopulos, "Locally adaptive metric nearest-neighbor classification," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, pp. 1281-1285, 2002.
    [5] D. Eigen and R. Fergus, "Nonparametric image parsing using adaptive neighbor sets," in 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012, pp. 2799-2806.
    [6] C. Galleguillos, A. Rabinovich, and S. Belongie, "Object categorization using co-occurrence, location and appearance," in 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008, pp. 1-8.
    [7] S. Lazebnik, C. Schmid, and J. Ponce, "Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories," in 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2006, pp. 2169-2178.
    [8] C. Liu, J. Yuen, and A. Torralba, "Nonparametric scene parsing: Label transfer via dense scene alignment," in 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 1972-1979.
    [9] C. Liu, J. Yuen, A. Torralba, J. Sivic, and W. Freeman, "SIFT flow: Dense correspondence across different scenes," Computer Vision–ECCV 2008, pp. 28-42, 2008.
    [10] M. Najafi, S. T. Namin, M. Salzmann, and L. Petersson, "Sample and Filter: Nonparametric Scene Parsing via Efficient Filtering," in 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 607-615.
    [11] T. V. Nguyen, C. Lu, J. Sepulveda, and S. Yan, "Adaptive Nonparametric Image Parsing," IEEE Transactions on Circuits and Systems for Video Technology, vol. 25, pp. 1565-1575, 2015.
    [12] A. Oliva and A. Torralba, "Modeling the shape of the scene: A holistic representation of the spatial envelope," International Journal of Computer Vision, vol. 42, pp. 145-175, 2001.
    [13] Y. A. Sheikh, E. A. Khan, and T. Kanade, "Mode-seeking by Medoidshifts," in 2007 IEEE 11th International Conference on Computer Vision, 2007, pp. 1-8.
    [14] G. Singh and J. Kosecka, "Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context," in 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013, pp. 3151-3157.
    [15] R. Szeliski, R. Zabih, D. Scharstein, O. Veksler, V. Kolmogorov, A. Agarwala, et al., "A comparative study of energy minimization methods for Markov random fields," Computer Vision–ECCV 2006, pp. 16-29, 2006.
    [16] J. Tighe and S. Lazebnik, "Superparsing: scalable nonparametric image parsing with superpixels," Computer Vision–ECCV 2010, pp. 352-365, 2010.
    [17] X. He, R. S. Zemel, and M. A. Carreira-Perpinan, "Multiscale conditional random fields for image labeling," in Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004, vol. 2, pp. II-695-II-702.
    [18] H. Zhang, T. Fang, X. Chen, Q. Zhao, and L. Quan, "Partial similarity based nonparametric scene parsing in certain environment," in 2011 IEEE Conference on Computer Vision and Pattern Recognition, 2011, pp. 2241-2248.
    [19] H. Zhang, J. Xiao, and L. Quan, "Supervised label transfer for semantic segmentation of street scenes," Computer Vision–ECCV 2010, pp. 561-574, 2010.
