Graduate student: | 李怡靜 Lee, Yi-Ching |
---|---|
Thesis title: | 以部位為基礎的協同表示方式進行視覺細分類 Collaborative Representation with Part Segmentation for Fine-Grained Visual Categorization |
Advisor: | 賴尚宏 Lai, Shang-Hong |
Committee members: | 劉庭祿, 陳煥宗 |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Publication year: | 2014 |
Academic year of graduation: | 102 |
Language: | English |
Number of pages: | 39 |
Keywords (Chinese): | 視覺細分類 |
Keywords (English): | fine-grained visual categorization |
Fine-grained visual categorization is a special case of image classification. It is challenging because objects may exhibit small inter-class variation and large intra-class variation caused by changes in viewpoint, pose, and lighting. To improve classification accuracy, we incorporate the part information of objects and propose a part-based classification framework for fine-grained visual categorization.
The proposed framework consists of the following steps. First, we remove the background and keep only the foreground region containing the object, which reduces the interference of the background with classification. Second, we infer the part segmentation from the foreground region and the part locations of the object; with the inferred part segmentation, we implicitly perform pose normalization on the object. Third, we extract features from each part segment and apply feature encoding to generate the final image representation. Finally, we classify each image using its collaborative representation over the whole training data, computed with regularized least squares.
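The final classification step can be illustrated with a minimal sketch of collaborative representation classification with regularized least squares. The abstract does not give the exact formulation used in the thesis, so everything below is an assumption: the function names `crc_rls_fit` and `crc_rls_predict`, the regularization weight `lam`, the residual-over-coefficient-norm scoring rule, and the toy 3-dimensional features, which stand in for the encoded part-based image representations.

```python
import numpy as np


def crc_rls_fit(X, lam=0.01):
    """Precompute P = (X^T X + lam * I)^-1 X^T for a training matrix X.

    X has shape (d, n): one column per training image feature vector.
    lam is a hypothetical ridge weight, not a value from the thesis.
    """
    n = X.shape[1]
    gram = X.T @ X + lam * np.eye(n)
    return np.linalg.solve(gram, X.T)  # shape (n, d)


def crc_rls_predict(X, labels, P, y):
    """Assign y to the class whose training columns reconstruct it best."""
    labels = np.asarray(labels)
    alpha = P @ y  # coefficients of y over ALL training samples
    best_label, best_score = None, np.inf
    for c in np.unique(labels):
        mask = labels == c
        # Reconstruction error using only the coefficients of class c.
        residual = np.linalg.norm(y - X[:, mask] @ alpha[mask])
        # Assumed scoring rule: residual divided by the coefficient norm.
        score = residual / (np.linalg.norm(alpha[mask]) + 1e-12)
        if score < best_score:
            best_label, best_score = c, score
    return best_label


# Toy data (made up for illustration): two classes, two samples each.
X_train = np.array([
    [1.0, 0.9, 0.0, 0.1],
    [0.0, 0.1, 1.0, 0.9],
    [0.1, 0.0, 0.1, 0.0],
])
y_train = np.array([0, 0, 1, 1])
P = crc_rls_fit(X_train, lam=0.01)

query = np.array([0.95, 0.05, 0.0])
print(crc_rls_predict(X_train, y_train, P, query))
```

Because the projection `P` depends only on the training data, it can be computed once and reused for every test image, so classifying a query costs one matrix-vector product plus the per-class residuals.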