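Graduate student: 鄭智元
Cheng, Chih Yuan
Thesis title: 基於社交資訊輔助之社群相簿人臉分群
Social Context Assisted Face Clustering for Social Group Photo Albums
Advisor: 林嘉文
Lin, Chia Wen
Committee members: 蔡文錦
Tsai, W. J.
王家慶
Wang, Jia-Ching
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2016
Graduation academic year: 105 (ROC calendar)
Language: English
Number of pages: 34
Chinese keywords: 人臉分群 (face clustering), 社交資訊 (social information)
English keywords: Face clustering, Social information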
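  • With the rapid growth of social media and personal cloud storage, anyone can easily upload audiovisual data. According to statistics, as many as 300 million images are uploaded to the social platform Facebook in a single day. Such a volume of uploads makes effective management and search of personal photo albums an important problem, and face clustering is one of the key techniques for it. Face clustering groups a face dataset that has not been manually labeled. Although past face recognition research has produced good features for describing faces, the main difficulty of face clustering is that even the face of the same person can yield very different feature descriptions because of expression, lighting, angle, occlusion, and other factors, while the faces of different people may yield similar features when these factors are similar. As a result, clustering based only on visual features often over-clusters (too many clusters, each with few samples): precision within each cluster is very high, but recall is very low. Beyond the visual information in the images, however, an album also carries additional information, such as whether people appear together in a photo and when each photo was captured, and this provides clustering cues that visual features cannot. For example, faces detected in the same photo cannot be the same person, and several photos are usually taken of the same scene, with the same group of people as subjects. This relationship of whether people appear together describes how a group interacts, and we call it social information.
    This thesis mines the social information in personal albums to improve an initial clustering result that uses only visual information. The improved result can then be fed back to the feature extraction step, so that the initial clustering takes both visual and social information into account. Experiments show that continuing to iterate this process makes the result more accurate.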
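    The over-clustering behavior described above (many small, pure clusters with high precision and low recall) can be illustrated with a small sketch. It assumes pairwise precision and recall, which may differ from the exact metric used in the thesis; the function name and the toy labels are hypothetical.

```python
from itertools import combinations


def pairwise_precision_recall(true_ids, cluster_ids):
    """Pairwise precision and recall of a clustering against identity labels.

    Assumption: a "positive" is a pair of faces placed in the same cluster.
    """
    same_cluster = same_person = both = 0
    for i, j in combinations(range(len(true_ids)), 2):
        in_cluster = cluster_ids[i] == cluster_ids[j]
        in_person = true_ids[i] == true_ids[j]
        same_cluster += in_cluster
        same_person += in_person
        both += in_cluster and in_person
    # With no pairs to judge, treat the score as perfect.
    precision = both / same_cluster if same_cluster else 1.0
    recall = both / same_person if same_person else 1.0
    return precision, recall


# Hypothetical toy data: two people, four faces each.
true_ids = ["A"] * 4 + ["B"] * 4
# Over-clustered result: every cluster is pure, but each person is split in two.
cluster_ids = [0, 0, 1, 1, 2, 2, 3, 3]
precision, recall = pairwise_precision_recall(true_ids, cluster_ids)
```

    In this toy case every same-cluster pair is correct, so precision is 1.0, but only 4 of the 12 same-person pairs are grouped together, so recall is one third.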


    With the development of social media and personal cloud services, anyone can easily upload media data. According to statistics, as many as 300 million photos are uploaded to Facebook per day. Efficiently managing personal photo albums has therefore become an important issue, and face clustering plays an important role in it. Face clustering is a technique for clustering an unlabeled face dataset. Although face recognition research over the past decades has produced good representations for describing faces, the difficulty of face clustering is that the facial features of the same person may differ greatly, while the features of different people may be similar, because of variations in expression, lighting, pose, and occlusion. Face clustering that depends only on visual features is therefore often over-clustered (clusters with high precision but very low recall). Besides the visual information in the images themselves, a personal album contains other metadata, such as whether faces appear in the same photo (co-occurrence) and when each photo was taken. This additional information provides clustering cues that visual information cannot. For instance, faces detected in the same photo cannot belong to the same person, while photos of the same scene usually contain the same group of people. We call this kind of information “social information”.
    In this thesis, we mine social information from a personal photo album to improve the purely visual clustering result. The results improved by social information are then fed back to the feature extraction step, so that the initial clustering can take both visual and social information into account. Our experiments show that iterating this procedure improves the clustering performance.
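    The co-occurrence cue in the abstract (faces detected in the same photo cannot be the same person) can be read as a cannot-link constraint on cluster merging. The following is a minimal sketch of one merging pass, not the thesis's implementation: it assumes each initial cluster is summarized by a centroid feature vector and by the set of photo IDs it appears in, uses Euclidean distance, and merges two clusters only if they are mutual nearest neighbors and share no photo. The function name, the distance choice, and the toy data are all hypothetical.

```python
import numpy as np


def mutual_nn_merge(centroids, photos_per_cluster):
    """One merging pass over initial clusters.

    Two clusters are merged when each is the other's nearest neighbor
    (Euclidean distance between centroids, an assumption) and they never
    appear in the same photo (co-occurrence acts as a cannot-link cue).
    Returns a list mapping each cluster index to its merged label.
    """
    n = len(centroids)
    diff = centroids[:, None, :] - centroids[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    np.fill_diagonal(dist, np.inf)  # a cluster is not its own neighbor
    nearest = dist.argmin(axis=1)

    labels = list(range(n))
    for i in range(n):
        j = int(nearest[i])
        # Visit each mutual pair once (i < j).
        if i < j and nearest[j] == i:
            if photos_per_cluster[i].isdisjoint(photos_per_cluster[j]):
                labels[j] = labels[i]
    return labels


# Hypothetical toy data: four initial clusters with 2-D centroids.
centroids = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
photos_per_cluster = [{"p1", "p2"}, {"p3"}, {"p1"}, {"p1", "p4"}]
labels = mutual_nn_merge(centroids, photos_per_cluster)
```

    Here clusters 0 and 1 are merged because they are mutual nearest neighbors with no shared photo, while clusters 2 and 3 stay separate even though they are visually close, because both appear in photo "p1".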

    Abstract (Chinese)
    Abstract
    Content
    Chapter 1 Introduction
        1.1 Motivation & background
        1.2 Research Objective
    Chapter 2 Related work
        2.1 Fully unsupervised face clustering
        2.2 Semi-supervised face clustering
        2.3 Event detection and clustering in photo album
    Chapter 3 Proposed method
        3.1 Overview
        3.2 Initial purely visual-based face clustering
        3.3 Split before merge
        3.4 Extracting social information
        3.5 Iteratively merging
            3.5.1 Mutual nearest neighbor
            3.5.2 Merging phase 1: Mutual Nearest Neighbor & co-occurrence
            3.5.3 Merging phase 2: Continue merging refer to common event
            3.5.4 Fine-tune the Deep Network
            3.5.5 Put back the unconfident set constrained by co-occurrence
    Chapter 4 Experiment and Discussion
        4.1 Dataset and settings
        4.2 Experimental results
            4.2.1 Evaluation: merging process
            4.2.2 Evaluation: fine tune after merging
            4.2.3 Clustering results
    Chapter 5 Conclusion
    References


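    Full-text release date: full text not authorized for public release (on-campus network)
    Full-text release date: full text not authorized for public release (off-campus network)
    Full-text release date: full text not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)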