Interactive Object Segmentation and Recognition | National Tsing Hua University Electronic Theses and Dissertations


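Graduate student:	謝郁志 Yu-Chih Hsieh
Thesis title:	Interactive Object Segmentation and Recognition
Advisor:	陳煥宗 Hwann-Tzong Chen
Committee members:
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication:	2008
Academic year of graduation:	96
Language:	English
Number of pages:	41
Chinese keywords:	物件切割 (object segmentation), 物件辨識 (object recognition)
English keywords:	segmentation, recognition, texton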

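Using computers to segment and recognize objects in images has long posed two difficult and challenging problems in computer vision. Many methods have been proposed to address them. However, most of these methods treat the two problems independently. In this thesis, we implement an interactive object segmentation and recognition system that addresses both problems at the same time.

The main idea of this thesis comes from jigsaw puzzles. First, we use an efficient algorithm to cut each sample image into many irregular pieces. Each piece is represented by the distribution of a set of common elements, and these distributions serve as the features used to train a classifier. At classification time, the test image is likewise cut into many pieces, and our classifier assigns an object category to each piece; once classification is done, object segmentation follows naturally. Some pieces are still assigned to the wrong category after classification. These pieces are corrected using the categories of the correctly classified pieces and their relative positions in the image.

Our system works well on textured images or objects such as grass, leaves, and sky. It also performs well on images with a prominent subject. The final refinement step does indeed work correctly.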

Abstract

Object recognition and segmentation are two of the most difficult and challenging problems in computer vision. Many approaches have been proposed to solve these problems. However, these two problems are usually addressed independently in previous approaches. We implement an interactive object segmentation and recognition system to solve these two problems at the same time.

The main idea of this thesis comes from jigsaw puzzles. First, we over-segment every input image into many superpixels. Then we classify all superpixels based on the bag-of-words model. Finally, we combine the superpixels that share the same label to obtain segmented objects. After the classification stage, some superpixels are still misclassified. Based on the information derived from user scribbles, we can refine the misclassified superpixels interactively. The information we use in the refinement process is the spatial and co-occurrence relations between objects of different categories.

Our object segmentation system works best on textured objects such as grass, trees, and sky. Images with a prominent subject can also be labeled accurately. The experimental results show that the refinement process indeed improves the preliminary segmentation results.
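
The classification pipeline outlined in the abstract (superpixels described by bag-of-words histograms, then merged by label) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the toy two-dimensional features, the squared Euclidean codeword assignment, and the tiny codebook are all assumptions made for illustration. The classifier itself and the interactive refinement step are left out, and the superpixel labels passed to `merge_by_label` are simply assumed to come from some classifier.

```python
from collections import Counter


def nearest_codeword(feature, codebook):
    """Index of the codeword closest to `feature` (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((f - c) ** 2 for f, c in zip(feature, codebook[k])),
    )


def superpixel_histograms(features, superpixel_ids, codebook):
    """Normalized codeword histogram for every superpixel.

    features[i] is the (hypothetical) feature vector of pixel i and
    superpixel_ids[i] is the id of the superpixel that pixel belongs to.
    """
    counts = {}
    for feat, sp in zip(features, superpixel_ids):
        counts.setdefault(sp, Counter())[nearest_codeword(feat, codebook)] += 1
    hists = {}
    for sp, c in counts.items():
        total = sum(c.values())
        hists[sp] = [c[k] / total for k in range(len(codebook))]
    return hists


def merge_by_label(superpixel_labels):
    """Group superpixel ids by predicted category.

    Simplification: spatial connectivity is ignored, so all superpixels
    with the same label end up in one group.
    """
    segments = {}
    for sp, label in superpixel_labels.items():
        segments.setdefault(label, []).append(sp)
    return segments


# Toy data: a 2-word codebook and six pixels spread over two superpixels.
codebook = [[0.0, 0.0], [1.0, 1.0]]
features = [[0.1, 0.0], [0.2, 0.1], [0.0, 0.1], [0.9, 1.0], [1.0, 0.8], [0.8, 0.9]]
superpixel_ids = [0, 0, 0, 0, 1, 1]

hists = superpixel_histograms(features, superpixel_ids, codebook)
segments = merge_by_label({0: "grass", 1: "sky", 2: "grass"})
```

In this sketch each histogram would be the feature vector handed to a classifier, and merging by label is what turns per-superpixel predictions into a segmentation.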

Contents

Introduction 7

Interactive Framework 9
1 Framework Overview  9
2 Codebook Construction 11
2.1 Previous Approach to Codebook Construction 12
2.2 Drawbacks and Modifications 13
2.3 Implementation Issues 16
2.4 Discussion 18
3 The Learning Phase 18
3.1 A Standard Learning Process 18
3.2 Drawbacks and Modifications 19
3.3 Implementation Issues 20
3.4 From Bag-of-Words to SVM 20
3.5 Discussion 22
4 Interactive Refinement 23
4.1 Overview 23
4.2 Category-Cooccurrence-Location Matrix 24
4.3 Refinement 25

Experiments 27
1 The Data Set 27
2 Filter Responses 27
3 Experimental Results 29
3.1 Superpixel Classification 29
3.2 Object Segmentation 32
4 Interactive Refinement 32
5 Discussion 34
5.1 Observation 35

Conclusion and Future Work 39

                                

Bibliography

[1] Dhruv Batra, Rahul Sukthankar, and Tsuhan Chen. Learning class-specific affinities for image labelling. In CVPR, 2008.

[2] Bernhard E. Boser, Isabelle Guyon, and Vladimir Vapnik. A training algorithm for optimal margin classifiers. In COLT, pages 144-152, 1992.

[3] Leo Breiman. Random forests. Machine Learning, 45(1):5-32, 2001.

[4] Chih-Chung Chang and Chih-Jen Lin. LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.

[5] Ondrej Chum and Andrew Zisserman. An exemplar model for learning object classes. In CVPR, 2007.

[6] Gabriella Csurka, Christopher R. Dance, Lixin Fan, Jutta Willamowski, and Cédric Bray. Visual categorization with bags of keypoints. In ECCV International Workshop on Statistical Learning in Computer Vision, 2004.

[7] Pedro F. Felzenszwalb and Daniel P. Huttenlocher. Efficient graph-based image segmentation. International Journal of Computer Vision, 59(2):167-181, 2004.

[8] Yoav Freund and Robert E. Schapire. A decision-theoretic generalization of online learning and an application to boosting. J. Comput. Syst. Sci., 55(1):119-139, 1997.

[9] Thomas K. Leung and Jitendra Malik. Representing and recognizing the visual appearance of materials using three-dimensional textons. International Journal of Computer Vision, 43(1):29-44, 2001.

[10] Frank Moosmann, Bill Triggs, and Frederic Jurie. Fast discriminative visual codebooks using randomized clustering forests. In NIPS, pages 985-992, 2006.

[11] Jianbo Shi and Jitendra Malik. Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell., 22(8):888-905, 2000.

[12] Jamie Shotton, John M. Winn, Carsten Rother, and Antonio Criminisi. TextonBoost: Joint appearance, shape and context modeling for multi-class object recognition and segmentation. In ECCV (1), pages 1-15, 2006.

[13] Josef Sivic and Andrew Zisserman. Video google: A text retrieval approach to object matching in videos. In ICCV, pages 1470-1477, 2003.

[14] John M. Winn, Antonio Criminisi, and Thomas P. Minka. Object categorization by learned universal visual dictionary. In ICCV, pages 1800-1807, 2005.

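Full-text release date: full text not authorized for public release (campus network)
Full-text release date: full text not authorized for public release (off-campus network)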
