Author: 龎渝庭 (Pang, Yu-Ting)
Thesis title: 領域自適應語義分割的自引導對抗學習 (Self-guided Adversarial Learning for Domain Adaptive Semantic Segmentation)
Advisor: 許秋婷 (Hsu, Chiou-Ting)
Committee members: 林嘉文 (Lin, Chia-Wen), 林彥宇 (Lin, Yen-Yu)
Degree: Master
Department: Institute of Information Systems and Applications, College of Electrical Engineering and Computer Science
Year of publication: 2021
Graduation academic year: 109 (2020-2021)
Language: English
Number of pages: 28
Keywords: Unsupervised domain adaptation, Semantic segmentation, Self-guided adversarial learning


Abstract: Unsupervised domain adaptation has been introduced to generalize semantic segmentation models from labeled synthetic images to unlabeled real-world images. Although much effort has been devoted to minimizing the cross-domain gap, segmentation results on real-world data remain highly unstable. In this thesis, we discuss two main issues that hinder previous methods from achieving satisfactory results and propose a novel self-guided adversarial learning framework to improve the domain adaptation capability of existing segmentation models. First, to cope with the unpredictable data variation in the real-world domain, we develop a self-guided adversarial learning method that selects reliable target pixels as guidance to lead the adaptation of the remaining pixels. Second, to address the class-imbalance issue, we devise a selection strategy that treats each class independently and combine it with class-level adversarial learning in a unified framework. Moreover, we show that incorporating self-guided adversarial learning into self-distillation further boosts performance. Experimental results show that the proposed method significantly outperforms previous unsupervised domain adaptation methods on several benchmark datasets.
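The thesis itself specifies the exact selection mechanism; as a minimal sketch, assuming PyTorch, softmax predictions of shape (C, H, W), and a hypothetical top_ratio hyper-parameter, the class-balanced guide-pixel selection could pick the most confident target pixels for each class independently. The function name select_guide_pixels and the top-k thresholding heuristic below are illustrative assumptions, not the thesis's exact procedure.

    # Hypothetical sketch: select reliable target pixels per class to serve
    # as guidance; names and the top-k heuristic are illustrative only.
    import torch

    def select_guide_pixels(probs: torch.Tensor, top_ratio: float = 0.1) -> torch.Tensor:
        """probs: softmax output of shape (C, H, W); returns an (H, W) bool mask."""
        num_classes, h, w = probs.shape
        conf, pred = probs.max(dim=0)        # per-pixel confidence and predicted class
        guide = torch.zeros(h, w, dtype=torch.bool)
        for c in range(num_classes):
            cls_mask = pred == c
            n = int(cls_mask.sum())
            if n == 0:
                continue                     # class absent from this prediction
            k = max(1, int(n * top_ratio))   # per-class budget keeps rare classes represented
            thresh = conf[cls_mask].topk(k).values.min()
            guide |= cls_mask & (conf >= thresh)
        return guide

A mask of this kind would mark the reliable pixels whose features guide the adversarial adaptation of the remaining, less confident pixels, while the per-class budget keeps rare classes from being crowded out by dominant ones.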

Table of contents:
Abstract (Chinese)---ii
Abstract---iii
Acknowledgements---iv
1 Introduction---1
2 Related Work---5
  2.1 Adversarial Training---5
  2.2 Self-training---7
  2.3 Self-distillation---7
3 Proposed Method---8
  3.1 Cross-domain Adaptation---9
  3.2 Class-balanced Guide-pixel Selection---10
  3.3 Self-guided Adaptation---11
  3.4 Training Objectives---12
  3.5 Self-distillation---13
4 Experiments---14
  4.1 Datasets and Evaluation Metrics---14
  4.2 Implementation Details---15
    4.2.1 Data Preprocessing---15
    4.2.2 Network Architecture and Hyper-parameter Settings---16
  4.3 Quantitative Results---17
    4.3.1 Comparison with Other Methods---17
    4.3.2 Ablation Study---18
    4.3.3 Class-wise Feature Discrimination---19
  4.4 Qualitative Results---20
  4.5 Discussion---22
5 Conclusion---24
References---25

