Graduate student: | 柯子逸 Ke, Zi-Yi |
---|---|
Thesis title: | 基於生成自我引導的密集標籤之弱監督語義分割 Generating Self-Guided Dense Annotations for Weakly Supervised Semantic Segmentation |
Advisor: | 許秋婷 Hsu, Chiou-Ting |
Committee members: | 簡仁宗 Chien, Jen-Tzung; 陳煥宗 Chen, Hwann-Tzong |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications |
Year of publication: | 2018 |
Academic year of graduation: | 106 |
Language: | English |
Number of pages: | 30 |
Chinese keywords: | 弱監督, 語義分割, 自我引導 |
English keywords: | Weakly Supervised, Semantic Segmentation, Self-Guided |
Learning semantic segmentation models under image-level supervision is far more challenging than under a fully supervised setting. Without knowing the exact pixel-label correspondence, most weakly supervised methods rely on external models to infer pseudo pixel-level labels for training semantic segmentation models. In this thesis, we aim to develop a single neural network that does not resort to any external model. We propose a novel self-guided strategy that fully utilizes the features learned at multiple levels to progressively generate dense pseudo labels. First, we use high-level features as class-specific localization maps to roughly locate each class. Next, we propose an affinity-guided method that encourages each localization map to be consistent with its corresponding intermediate-level features. Third, we adopt the training image itself as guidance and propose a self-guided refinement that transfers the image's inherent structure into the maps. Finally, we derive pseudo pixel-level labels from these localization maps and use them as ground truth to train the semantic segmentation model. The proposed self-guided strategy is a unified framework built on a single network, and the training procedure alternates between updating the feature representation and refining the localization maps. Experimental results on the PASCAL VOC 2012 segmentation benchmark demonstrate that our method outperforms other weakly supervised methods under the same setting.
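The final step of the abstract, deriving pseudo pixel-level labels from class-specific localization maps, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `derive_pseudo_labels`, the background threshold `bg_threshold`, and the assumption that the maps are already scaled to [0, 1] are all hypothetical choices made for illustration. The sketch keeps only the classes named in the image-level label, assigns each pixel to its highest-scoring class, and falls back to background where no class scores high enough.

```python
import numpy as np


def derive_pseudo_labels(loc_maps, image_labels, bg_threshold=0.3):
    """Turn class-specific localization maps into a dense pseudo label map.

    loc_maps:     (C, H, W) scores for C foreground classes, assumed in [0, 1].
    image_labels: length-C booleans, True if the class appears in the image.
    bg_threshold: hypothetical cutoff below which a pixel becomes background.

    Returns an (H, W) integer array: 0 is background, c + 1 is class c.
    """
    loc_maps = np.asarray(loc_maps, dtype=np.float64)
    present = np.asarray(image_labels, dtype=bool)

    # Classes absent from the image-level label can never win the argmax.
    masked = np.where(present[:, None, None], loc_maps, -np.inf)

    best_class = masked.argmax(axis=0)   # index of the strongest class per pixel
    best_score = masked.max(axis=0)      # its score per pixel

    # Pixels with no confident class are labeled background (0).
    labels = np.where(best_score >= bg_threshold, best_class + 1, 0)
    return labels.astype(np.int64)


if __name__ == "__main__":
    maps = np.array([
        [[0.90, 0.10], [0.20, 0.10]],   # class 0 (present)
        [[0.95, 0.80], [0.10, 0.10]],   # class 1 (absent from the image label)
        [[0.10, 0.20], [0.70, 0.05]],   # class 2 (present)
    ])
    print(derive_pseudo_labels(maps, [True, False, True]))
```

Masking by the image-level label is what makes the supervision "weak but exact": the label cannot say where a class is, but it rules out every class that is not in the image, so a strong response from an absent class (class 1 above) is ignored.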