
Graduate Student: HUANG, ZI-LING (黃子淩)
Thesis Title: Group Re-identification via Domain-Transferred Multi-Level Relations (基於風格變化多尺度關係的人群重識別研究)
Advisor: Lin, Chia-Wen (林嘉文)
Committee Members: Hu, Min-Jun (胡敏君); Lin, Yen-Yu (林彥宇); Huang, Ching-Chun (黃敬群)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2020
Graduation Academic Year: 108
Language: English
Number of Pages: 31
Chinese Keywords: 人群重識別 (group re-identification), 行人重識別 (person re-identification), 圖網絡 (graph network)
English Keywords: Group Re-identification, Person Re-identification, Graph Neural Network
Group re-identification is an important yet under-studied topic. Compared with ordinary person re-identification, it poses additional difficulties: since every pedestrian is an independent individual, group members freely change their relative positions within the group and may join or leave it at will. Group re-identification therefore centers on solving these two additional difficulties. In this thesis, we propose two methods for the group re-identification task.
(1) Because group re-identification lacks training data, we transfer the style of person re-identification data to that of the group re-identification test data. An adaptive, unsupervised method then fuses the single-person features and the pairwise (couple) relation features within a group to perform group re-identification.
(2) In this method, we model all members of a group as a graph. Using the style-transferred data, we simulate changes of relative position within the group and arbitrary members joining or leaving it, thereby composing the training data; we train a graph neural network on the simulated data and then perform testing.
Experiments show that both proposed methods outperform existing methods.


Group re-identification (G-ReID) is an important yet less-studied task. Its challenges lie not only in the appearance changes of individuals, which have been well investigated in general person re-identification (ReID), but also in changes of group layout and membership. The key task of G-ReID is therefore to learn representations that are robust to such changes. To this end, we propose a Transferred Single and Couple Representation Learning Network (TSCN) and a Domain-Transferred Graph Neural Network (DoT-GNN). Their merits are twofold: 1) Owing to the lack of labeled training samples, existing G-ReID methods mainly rely on unsatisfactory hand-crafted features. To exploit the power of deep learning models, we treat a group as multiple persons and transfer a labeled ReID dataset to the style of the target G-ReID dataset to learn single representations. Taking the neighborhood relationships within a group into account, we further propose learning a novel couple representation between two group members, which achieves greater discriminative power in G-ReID tasks. In addition, an unsupervised weight learning method adaptively fuses the results of the different views according to their result patterns.
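As an illustration only (the abstract gives no implementation detail), the Python sketch below shows one plausible, hypothetical reading of the two TSCN views and their unsupervised fusion: single descriptors per member, couple descriptors per member pair, and per-view weights derived from each view's own score curve. The margin-based weighting is a crude stand-in loosely inspired by the query-adaptive late fusion of Zheng et al. [14]; all names here are invented for the sketch, not the thesis's actual method.

    import itertools
    import numpy as np

    def group_descriptors(person_features):
        """Build the two views for one group image.

        person_features: per-member feature vectors, e.g. from a CNN
        trained on domain-transferred ReID data (hypothetical input).
        """
        singles = [f / (np.linalg.norm(f) + 1e-12) for f in person_features]
        # Couple view: one descriptor per member pair, encoding the
        # neighborhood relation between two group members.
        couples = [np.concatenate([a, b])
                   for a, b in itertools.combinations(singles, 2)]
        return singles, couples

    def adaptive_fusion(score_lists):
        """Fuse per-view gallery score vectors with unsupervised weights.

        A view whose top match clearly stands out from the remaining
        scores is judged more reliable and receives a larger weight.
        """
        weights = []
        for s in score_lists:
            s_sorted = np.sort(s)[::-1]
            margin = s_sorted[0] - s_sorted[1:].mean()  # peak vs. tail
            weights.append(max(margin, 0.0))
        weights = np.asarray(weights)
        weights /= weights.sum() + 1e-12
        return sum(w * s for w, s in zip(weights, score_lists))

Under this reading, a view with a flat score curve (no clear best match) is automatically down-weighted, which is the intuition behind fusing "according to their result patterns" above.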
2) We treat a group as a graph, where each node denotes an individual's feature and each edge represents the relation between a pair of individuals. We propose a graph generation strategy to create sufficient graph samples, and we train the GNN on these generated samples to acquire graph features that are robust to large graph variations. Extensive experimental results demonstrate the effectiveness of our approach, which significantly outperforms state-of-the-art methods. This thesis is composed of three published papers [1], [2], [3].
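Likewise as a hedged sketch only, the snippet below illustrates the two DoT-GNN ingredients named above: a graph generator that simulates layout changes (member permutation) and membership changes (a member leaving or joining), and one mean-aggregation message-passing layer with a permutation-invariant readout. Everything here (the function names, the fully connected adjacency, the duplication trick for a joining member) is an assumption for illustration, not the thesis's exact design.

    import random
    import numpy as np

    def generate_graph_variants(node_feats, num_variants=4):
        """Create extra training graphs from one group.

        node_feats: (N, D) array, one row per member feature taken
        from the domain-transferred images (hypothetical input).
        """
        variants = []
        for _ in range(num_variants):
            idx = list(range(len(node_feats)))
            random.shuffle(idx)                     # layout change
            if len(idx) > 2 and random.random() < 0.5:
                idx.pop()                           # a member leaves
            else:
                idx.append(random.choice(idx))      # a member "joins"
            variants.append(node_feats[idx])
        return variants

    def gnn_layer(node_feats, adj, weight):
        """One mean-aggregation message-passing layer (ReLU activation).

        adj: (N, N) 0/1 adjacency, e.g. fully connected within a group;
        weight: (D, D) learned projection.
        """
        deg = adj.sum(axis=1, keepdims=True) + 1e-12
        neighbor_mean = (adj @ node_feats) / deg    # aggregate neighbors
        return np.maximum((node_feats + neighbor_mean) @ weight, 0.0)

    def graph_feature(node_feats):
        """Permutation-invariant readout: mean over all nodes."""
        return node_feats.mean(axis=0)

Because both the layer and the readout are permutation-invariant, a graph feature trained on such variants can stay stable under exactly the layout and membership changes the generator simulates.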

I Introduction
II Related Works
III Domain-Transferred Single and Couple Representation Learning
    III-A Domain Transfer
    III-B Offline Representation Learning
    III-C Online Feature Fusion
    III-D Experimental Results
        III-D1 Datasets and Experiment Setting
        III-D2 Performance of Domain-Transferred Representation Learning
        III-D3 Performance of Couple Representation Learning
        III-D4 Performance of Fusion
        III-D5 Comparison with State-of-the-art Methods
IV Domain-Transferred Graph Neural Network for Group Re-identification
    IV-A Proposed Framework
    IV-B Domain-Transferred Model
    IV-C Graph Generator
    IV-D GNN
    IV-E Testing Step
    IV-F Experiments
        IV-F1 Datasets and Experimental Setting
        IV-F2 Performance of Image Domain Transfer
        IV-F3 Comparison with the State-of-the-art Methods
        IV-F4 The Influence of the Graph Generator
        IV-F5 Ablation Study
        IV-F6 Subjective Comparison
V Conclusion
References

[1] Z. Huang, Z. Wang, W. Hu, C.-W. Lin, and S. Satoh, “DoT-GNN: Domain-transferred graph neural network for group re-identification,” in ACM Int. Conf. Multimedia, 2019, pp. 1888–1896.
[2] Z. Huang, Z. Wang, T.-Y. Hung, S. Satoh, and C.-W. Lin, “Group re-identification via transferred representation and adaptive fusion,” in IEEE Int. Conf. Multimedia Big Data (BigMM), 2019, pp. 128–132.
[3] Z. Huang, Z. Wang, S. Satoh, and C.-W. Lin, “Group re-identification via transferred single and couple representation learning,” arXiv preprint arXiv:1905.04854, 2019.
[4] Y.-C. Chen, W.-S. Zheng, and J. Lai, “Mirror representation for modeling view-specific transform in person re-identification,” in Int. Joint Conf. Artificial Intell., 2015, pp. 3402–3408.
[5] S. Li, M. Shao, and Y. Fu, “Cross-view projective dictionary learning for person re-identification,” in Int. Joint Conf. Artificial Intell., 2015, pp. 2155–2161.
[6] Z. Wang, R. Hu, C. Liang, Y. Yu, J. Jiang, M. Ye, J. Chen, and Q. Leng, “Zero-shot person re-identification via cross-view consistency,” IEEE Trans. Multimedia, vol. 18, no. 2, pp. 260–272, 2016.
[7] Z. Wang, R. Hu, Y. Yu, J. Jiang, C. Liang, and J. Wang, “Scale-adaptive low-resolution person re-identification via learning a discriminating surface,” in Int. Joint Conf. Artificial Intell., 2016, pp. 2669–2675.
[8] W. Li, X. Zhu, and S. Gong, “Person re-identification by deep joint learning of multi-loss classification,” in Int. Joint Conf. Artificial Intell., 2017, pp. 2194–2200.
[9] G. Lisanti, N. Martinel, A. Del Bimbo, and G. Luca Foresti, “Group re-identification via unsupervised transfer of sparse features encoding,” in Int. Conf. Comput. Vis., 2017, pp. 2449–2458.
[10] W. Lin, Y. Li, H. Xiao, S. John, J. Zou, H. Xiong, J. Wang, and M. Tao, “Group re-identification with multi-grained matching and integration,” IEEE Trans. Cybern., 2019.
[11] Z. Zhong, L. Zheng, S. Li, and Y. Yang, “Generalizing a person retrieval model hetero- and homogeneously,” in European Conf. Comput. Vis., 2018, pp. 172–188.
[12] Z. Zhong, L. Zheng, Z. Zheng, S. Li, and Y. Yang, “Camera style adaptation for person re-identification,” in IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 5157–5166.
[13] T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” in Int. Conf. Learn. Represent., 2017.
[14] L. Zheng, S. Wang, L. Tian, F. He, Z. Liu, and Q. Tian, “Query-adaptive late fusion for image search and person re-identification,” in IEEE Conf. Comput. Vis. Pattern Recognit., 2015, pp. 1741–1750.
[15] W. Li, R. Zhao, T. Xiao, and X. Wang, “DeepReID: Deep filter pairing neural network for person re-identification,” in IEEE Conf. Comput. Vis. Pattern Recognit., 2014, pp. 152–159.
[16] T. Xiao, H. Li, W. Ouyang, and X. Wang, “Learning deep feature representations with domain guided dropout for person re-identification,” in IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 1249–1258.
[17] C. Su, J. Li, S. Zhang, J. Xing, W. Gao, and Q. Tian, “Pose-driven deep convolutional model for person re-identification,” in Int. Conf. Comput. Vis., 2017, pp. 3960–3969.
[18] S.-Z. Chen, C.-C. Guo, and J.-H. Lai, “Deep ranking for person re-identification via joint representation learning,” IEEE Trans. Image Process., vol. 25, no. 5, pp. 2353–2367, 2016.
[19] F. Zhu, X. Kong, L. Zheng, H. Fu, and Q. Tian, “Part-based deep hashing for large-scale person re-identification,” IEEE Trans. Image Process., vol. 26, no. 10, pp. 4806–4817, 2017.
[20] H. Yao, S. Zhang, R. Hong, Y. Zhang, C. Xu, and Q. Tian, “Deep representation learning with part loss for person re-identification,” IEEE Trans. Image Process., 2019.
[21] L. Zheng, Y. Huang, H. Lu, and Y. Yang, “Pose invariant embedding for deep person re-identification,” IEEE Trans. Image Process., 2019.
[22] Y. Cai, V. Takala, and M. Pietikainen, “Matching groups of people by covariance descriptor,” in Int. Conf. Pattern Recognit., 2010, pp. 2744–2747.
[23] W.-S. Zheng, S. Gong, and T. Xiang, “Associating groups of people,” in British Mach. Vis. Conf., 2009.
[24] F. Zhu, Q. Chu, and N. Yu, “Consistent matching based on boosted salience channels for group re-identification,” in IEEE Int. Conf. Image Process., 2016, pp. 4279–4283.
[25] L. A. Gatys, A. S. Ecker, and M. Bethge, “Image style transfer using convolutional neural networks,” in IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 2414–2423.
[26] J. Johnson, A. Alahi, and L. Fei-Fei, “Perceptual losses for real-time style transfer and super-resolution,” in European Conf. Comput. Vis., 2016, pp. 694–711.
[27] K. Bousmalis, N. Silberman, D. Dohan, D. Erhan, and D. Krishnan, “Unsupervised pixel-level domain adaptation with generative adversarial networks,” in IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 3722–3731.
[28] Y. Taigman, A. Polyak, and L. Wolf, “Unsupervised cross-domain image generation,” arXiv preprint arXiv:1611.02200, 2016.
[29] W. Deng, L. Zheng, Q. Ye, G. Kang, Y. Yang, and J. Jiao, “Image-image domain adaptation with preserved self-similarity and domain-dissimilarity for person re-identification,” in IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 994–1003.
[30] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Adv. Neural Inf. Process. Syst., 2012, pp. 1097–1105.
[31] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the inception architecture for computer vision,” in IEEE Conf. Comput. Vis. Pattern Recognit., 2016.
[32] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 770–778.
[33] I. Sutskever, O. Vinyals, and Q. V. Le, “Sequence to sequence learning with neural networks,” in Adv. Neural Inf. Process. Syst., 2014.
[34] J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl, “Neural message passing for quantum chemistry,” in Int. Conf. Mach. Learn., 2017.
[35] S. Takerkart, G. Auzias, B. Thirion, and L. Ralaivola, “Graph-based inter-subject pattern analysis of fMRI data,” PLoS One, 2014.
[36] J. Bruna, W. Zaremba, A. Szlam, and Y. LeCun, “Spectral networks and locally connected networks on graphs,” arXiv preprint arXiv:1312.6203, 2013.
[37] T. Hamaguchi, H. Oiwa, M. Shimbo, and Y. Matsumoto, “Knowledge transfer for out-of-knowledge-base entities: A graph neural network approach,” arXiv preprint arXiv:1706.05674, 2017.
[38] D. Beck, G. Haffari, and T. Cohn, “Graph-to-sequence learning using gated graph neural networks,” arXiv preprint arXiv:1806.09835, 2018.
[39] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Lio, and Y. Bengio, “Graph attention networks,” arXiv preprint arXiv:1710.10903, 2017.
[40] C.-Y. Weng, W.-T. Chu, and J.-L. Wu, “RoleNet: Movie analysis from the perspective of social networks,” IEEE Trans. Multimedia, pp. 256–271, 2009.
[41] C.-M. Tsai, L.-W. Kang, C.-W. Lin, and W. Lin, “Scene-based movie summarization via role-community networks,” IEEE Trans. Circuits Syst. Video Technol., pp. 1927–1940, 2013.
[42] E. Ristani, F. Solera, R. Zou, R. Cucchiara, and C. Tomasi, “Performance measures and a data set for multi-target, multi-camera tracking,” in European Conf. Comput. Vis. Workshop, 2016, pp. 17–35.
[43] L. Zheng, L. Shen, L. Tian, S. Wang, J. Wang, and Q. Tian, “Scalable person re-identification: A benchmark,” in Int. Conf. Comput. Vis., 2015, pp. 1116–1124.
[44] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, “Unpaired image-to-image translation using cycle-consistent adversarial networks,” in Int. Conf. Comput. Vis., 2017, pp. 2223–2232.
