| Graduate student | 余償鑫 Yu, Chang-Hsin |
|---|---|
| Thesis title | Robust Affinity Propagation for Picture Clustering (強健式親和性互動方法在圖像分群上的研究) |
| Advisor | 王家祥 Wang, Jia-Shung |
| Degree | Master (碩士) |
| Department | College of Electrical Engineering and Computer Science, Institute of Information Systems and Applications (電機資訊學院 資訊系統與應用研究所) |
| Year of publication | 2010 |
| Academic year | 98 |
| Language | English |
| Pages | 57 |
| Keywords (Chinese) | 圖片分群, 親和性互動, 照片分群, 圖像分群 |
| Keywords (English) | affinity propagation, picture clustering, image clustering, photo clustering |
In recent years, taking pictures with digital cameras has become increasingly popular. Unlike traditional film photos, the cost of taking a digital photo is nearly zero, so users often accumulate a great number of digital pictures. For ease of management, these pictures should be well categorized; however, grouping a large collection of pictures by hand is a difficult and tedious task. How to use computers to group numerous, unorganized digital pictures automatically and efficiently is therefore an important research challenge. In this thesis, we propose a content-based picture clustering method for this problem. The proposed method consists of three phases. In phase I, we extract local SIFT (Scale-Invariant Feature Transform) features and global MPEG-7 CLD (Color Layout Descriptor) features from all input pictures. SIFT features describe distinctive local characteristics of an image but ignore color information, so we add the CLD color feature to compensate for this limitation. In phase II, we adopt the Affinity Propagation (AP) algorithm as our image clustering method; furthermore, we improve its instability by appending an estimation step that determines a more suitable initial setting for AP. Finally, phase III is a post-processing stage that merges the small, similar groups produced in phase II. The experimental results show that the proposed method achieves an ARI (Adjusted Rand Index) score of over 80% on 1000 pictures. When the dataset expands to 3000 pictures, the ARI score remains at 70%. On average, the proposed method improves the ARI score by 54% compared with the unmodified AP algorithm. All the experimental results show that the proposed clustering algorithm is effective and robust.
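The thesis record does not include source code; as a hedged illustration of the message-passing scheme of Affinity Propagation (Frey and Dueck, reference [6]) that phase II builds on, the following NumPy sketch implements the standard responsibility/availability updates on a precomputed similarity matrix. The function name, parameter choices, and toy data are illustrative, not taken from the thesis.

```python
import numpy as np

def affinity_propagation(S, damping=0.9, max_iter=200):
    """Minimal Affinity Propagation on a similarity matrix S.

    The diagonal of S holds the 'preferences' (self-similarities) that
    influence how many exemplars emerge. Returns an exemplar index per point.
    """
    n = S.shape[0]
    R = np.zeros((n, n))  # responsibilities r(i, k)
    A = np.zeros((n, n))  # availabilities a(i, k)
    for _ in range(max_iter):
        # Responsibility: r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        AS = A + S
        idx = np.argmax(AS, axis=1)
        first_max = AS[np.arange(n), idx]
        AS[np.arange(n), idx] = -np.inf          # mask the max to find 2nd max
        second_max = AS.max(axis=1)
        R_new = S - first_max[:, None]
        R_new[np.arange(n), idx] = S[np.arange(n), idx] - second_max
        R = damping * R + (1 - damping) * R_new
        # Availability: a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())       # keep r(k,k) unclipped
        A_new = Rp.sum(axis=0)[None, :] - Rp
        dA = A_new.diagonal().copy()             # a(k,k) has no min(0, .) clamp
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, dA)
        A = damping * A + (1 - damping) * A_new
    # Points with positive a(k,k) + r(k,k) are exemplars; assign the rest.
    exemplars = np.flatnonzero(np.diag(A + R) > 0)
    labels = exemplars[np.argmax(S[:, exemplars], axis=1)]
    labels[exemplars] = exemplars
    return labels

# Toy demo: two tight groups of 2-D points. Similarity is negative squared
# distance; preferences (diagonal) are set to the median off-diagonal value.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
S = -((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
np.fill_diagonal(S, np.median(S[~np.eye(len(pts), dtype=bool)]))
labels = affinity_propagation(S)
# Expect the first three points to share one exemplar and the last three another.
```

As the abstract notes, plain AP is sensitive to this initial preference setting; phase II of the thesis adds an estimation step to pick it, and phase III merges small, similar clusters afterwards.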
[1]. Y. Jing and S. Baluja, “VisualRank: Applying PageRank to Large-Scale Image Search,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 11, pp. 1877-1890, 2008.
[2]. Y.T. Zheng, M. Zhao, Y. Song, H. Adam, U. Buddemeier, A. Bissacco, F. Brucher, T.-S. Chua, and H. Neven, “Tour the world: Building a web-scale landmark recognition engine,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1085-1092, 2009.
[3]. A. Gomi, R. Miyazaki, T. Itoh, and L. Jia, “CAT: A Hierarchical Image Browser Using a Rectangle Packing Technique,” in Proc. International Conference on Information Visualisation, pp. 82-87, 2008.
[4]. D. Cai, X. He, Z. Li, W.Y. Ma, and J.R. Wen, “Hierarchical clustering of WWW image search results using visual, textual and link information,” in Proc. ACM International Conference on Multimedia, pp. 952-959, 2004.
[5]. T. Liu, C. Rosenberg, and H. A. Rowley, “Clustering Billions of Images with Large Scale Nearest Neighbor Search,” in Proc. IEEE Workshop on Applications of Computer Vision, p. 28, 2007.
[6]. B. J. Frey and D. Dueck, “Clustering by passing messages between data points,” Science, vol. 315, no. 5814, pp. 972-976, 2007.
[7]. B.J. Frey and D. Dueck, “Non-metric affinity propagation for unsupervised image categorization,” in Proc. International Conference on Computer Vision, pp. 1-8, 2007.
[8]. N. Batalas, C. Diou, and A. Delopoulos, “Efficient Indexing, Color Descriptors and Browsing in Image Databases,” in Proc. International Workshop on Semantic Media Adaptation and Personalization, pp. 129-134, 2006.
[9]. P. Liu, K. Jia, Z. Wang, and Z. Lv, “A New and Effective Image Retrieval Method Based on Combined Features,” in Proc. International Conference on Image and Graphics, pp. 786-790, 2007.
[10]. D.K. Park, Y.S. Jeon, C.S. Won, S.J. Park, and S.J. Yoo, “A composite histogram for image retrieval,” in Proc. IEEE International Conference on Multimedia and Expo, vol. 1, pp. 355-358, 2000.
[11]. J.A. Walter, D. Webling, K. Essig, and H. Ritter, “Interactive Hyperbolic Image Browsing - Towards an Integrated Multimedia Navigator,” in Proc. ACM SIGKDD International Conference, pp. 111-118, 2006.
[12]. MPEG-7 standards, http://www.chiariglione.org/mpeg/standards/mpeg-7/mpeg-7.htm
[13]. D.G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
[14]. SIFT feature demo code, http://www.cs.ubc.ca/~lowe/keypoints/
[15]. Y. Ke and R. Sukthankar, “PCA-SIFT: A more distinctive representation for local image descriptors,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 506-513, 2004.
[16]. A.B. Dahl and H. Aanaes, “Effective image database search via dimensionality reduction,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-6, 2008.
[17]. K. Mikolajczyk and C. Schmid, “A performance evaluation of local descriptors,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615-1630, 2005.
[18]. J.H. Park, K.W. Park, S.H. Baeg, and M.H. Baeg, “π-SIFT: A Photometric and Scale Invariant Feature Transform,” in Proc. International Conference on Pattern Recognition, pp. 1-4, 2008.
[19]. A.E. Abdel-Hakim and A.A. Farag, “CSIFT: A SIFT descriptor with color invariant characteristics,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 1978-1983, 2006.
[20]. H. Yang and Q. Wang, “A novel local feature descriptor for image matching,” in Proc. IEEE International Conference on Multimedia and Expo, pp. 1405-1408, 2008.
[21]. D.R. Kisku, A. Rattani, E. Grosso, and M. Tistarelli, “Face identification by SIFT-based complete graph topology,” in Proc. IEEE Workshop on Automatic Identification Advanced Technologies, pp. 63-68, 2007.
[22]. S. Lazebnik, C. Schmid, and J. Ponce, “Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories,” in Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2, pp. 2169-2178, 2006.
[23]. J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, “Object retrieval with large vocabularies and fast spatial matching,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-8, 2007.
[24]. L. Hubert and P. Arabie, “Comparing partitions,” Journal of Classification, vol. 2, no. 1, pp. 193-218, 1985.
[25]. N.X. Vinh, J. Epps, and J. Bailey, “Information Theoretic Measures for Clustering Comparison: Is a Correction for Chance Necessary?” in Proc. International Conference on Machine Learning, pp. 1073-1080, 2009.
[26]. D.L. Davies and D.W. Bouldin, “A cluster separation measure,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 1, no. 2, pp. 224-227, 1979.
[27]. D. Nister and H. Stewenius, “Scalable Recognition with a Vocabulary Tree,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 2161-2168, 2006.
[28]. UK benchmark database, http://vis.uky.edu/~stewe/ukbench/
[29]. Caltech-101 database, http://www.vision.caltech.edu/Image_Datasets/Caltech101/
[30]. Oxford building database, http://www.robots.ox.ac.uk/~vgg/data/oxbuildings/index.html