
Author: Lin, Ching-Yuan (林清淵)
Thesis Title: Multiple Foreground Co-segmentation Using Unsupervised Convolutional Neural Networks (用於多重前景協同分割的無監督式卷積神經網路方法)
Advisors: Lin, Chia-Wen (林嘉文); Lin, Yen-Yu (林彥宇)
Oral Examination Committee: Huang, Ching-Chun (黃敬群); Hu, Min-Chun (胡敏君)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2020
Graduation Academic Year: 108
Language: English
Number of Pages: 28
Chinese Keywords: 影像共分割 (image co-segmentation); 多重前景協同分割 (multiple foreground co-segmentation); 多類別物件協同分割 (multi-class object co-segmentation)
Foreign Keywords: Object co-segmentation; Multiple foreground co-segmentation; Multi-class co-segmentation; Multi-class image co-segmentation
    摘要 (Abstract in Chinese, translated): Multiple foreground co-segmentation has long been regarded as a challenging research problem. Its goal is to identify and segment all objects of the same class across a set of images containing objects from various foreground classes. In this paper, we address this challenging task by proposing the first unsupervised, end-to-end trainable CNN framework, which consists of a co-attention map generator and a feature extractor. Unlike most conventional methods, the proposed approach can learn more effectively from the entire set of unlabeled images through end-to-end training. For the loss functions, we employ an unsupervised co-attention loss, which focuses on reducing the intra-class variations and enlarging the inter-class distances among the extracted features, together with a mask loss that improves the accuracy of the co-attention maps and thereby the final co-segmentation results. Compared with previous multiple foreground co-segmentation methods, the proposed method performs favorably on our newly collected MFC dataset.


    Abstract: Multiple foreground co-segmentation, which aims at co-segmenting multiple foreground objects from various classes across a set of images, has been considered a challenging task due to the high intra-class variations and inter-class similarity among the extracted features. In this paper, we address this challenging task by proposing the first unsupervised, end-to-end trainable CNN-based framework, consisting of a co-attention map generator and a feature extractor. Unlike previous conventional approaches, the proposed method can learn from an entire set of unlabeled images simultaneously in an end-to-end manner, which makes it considerably more effective. We develop two unsupervised loss functions: the co-attention loss, which reduces the intra-class variations and enlarges the inter-class margins among the extracted features, and the mask loss, which emphasizes complete foreground objects and refines the co-attention maps to suppress false positives in the co-segmentation results. The proposed architecture performs favorably against existing algorithms on our newly introduced MFC dataset.
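
    The abstract only names the two loss terms, so the following PyTorch-style sketch is a hedged illustration of how such an unsupervised co-attention loss could be written, not the thesis implementation. It assumes the co-attention maps are used to pool per-image foreground and background descriptors from the extracted features, penalizes the spread among foreground descriptors (intra-class variation), and applies a hinge margin between foreground and background descriptors (inter-class separation). The function name coattention_loss, the pooling scheme, and the margin value are all hypothetical.

    # Illustrative sketch only: NOT the thesis implementation.
    # Assumes PyTorch; names, pooling scheme, and margin are hypothetical.
    import torch
    import torch.nn.functional as F

    def coattention_loss(features, attention_maps, margin=1.0):
        """features: (N, C, H, W) maps from the feature extractor for N images of one group.
        attention_maps: (N, 1, H, W) co-attention maps in [0, 1] from the generator.

        Pools one foreground and one background descriptor per image with the maps,
        then penalizes the spread among foreground descriptors (intra-class variation)
        and enforces a hinge margin between foreground and background descriptors
        (inter-class separation)."""
        eps = 1e-6
        fg = (features * attention_maps).sum(dim=(2, 3)) / (attention_maps.sum(dim=(2, 3)) + eps)
        bg = (features * (1.0 - attention_maps)).sum(dim=(2, 3)) / ((1.0 - attention_maps).sum(dim=(2, 3)) + eps)
        fg = F.normalize(fg, dim=1)  # (N, C) unit-length foreground descriptors
        bg = F.normalize(bg, dim=1)  # (N, C) unit-length background descriptors

        intra = torch.cdist(fg, fg).mean()                   # keep foreground descriptors close together
        inter = F.relu(margin - torch.cdist(fg, bg)).mean()  # push foreground away from background
        return intra + inter

    In such a setup, coattention_loss(features, attention_maps) would be minimized jointly with the mask loss; how the thesis actually defines, combines, or weights the two terms is not stated in this record.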

    摘要 ...... i
    Abstract ...... ii
    Content ...... iii
    Chapter 1 Introduction ...... 1
    Chapter 2 Related Work ...... 4
    2.1 Object Co-segmentation ...... 4
    2.2 Multiple Foreground Co-segmentation ...... 6
    2.3 Unsupervised MFC Approaches ...... 7
    Chapter 3 Proposed Method ...... 9
    3.1 Architecture ...... 9
    3.2 Co-attention Loss ...... 11
    3.3 Mask Loss ...... 13
    Chapter 4 Experiments ...... 15
    4.1 Datasets ...... 15
    4.2 Evaluation Metrics ...... 16
    4.3 Training and Implementation Details ...... 17
    4.4 Comparison with Previous MFC Methods ...... 18
    4.5 Effectiveness of the Additional Mask Loss ...... 23
    Chapter 5 Conclusion ...... 24
    References ...... 25

