
Author: Kuo, Yuan-Jhe (郭源哲)
Title: Towards Robust In-Domain and Out-of-Domain Generalization: Contrastive Learning with Prototype Alignment and Collaborative Attention
Advisor: Hsu, Chiou-Ting (許秋婷)
Committee: Wang, Sheng-Jyh (王聖智); Chen, Hwann-Tzong (陳煥宗)
Degree: Master
Department: Computer Science (College of Electrical Engineering and Computer Science)
Year of publication: 2022
Graduation academic year: 111
Language: English
Pages: 26
Keywords: Domain generalization, Contrastive learning, Noisy labels, Metric learning


    Domain generalization aims to learn a model from multiple source domains that generalizes to unseen target domains. Assuming the target domains are distributed differently from the source domains, most previous methods address the out-of-domain generalization issue but pay little attention to the in-domain performance on the source domains. Because the target domains are unseen and may be distributed similarly to the source domains, we believe the in-domain and out-of-domain performances are equally important. Model robustness is also a concern when the source domains contain inconsistent or noisy ground-truth labels. Therefore, in this thesis, we propose a contrastive learning framework with prototype alignment and collaborative attention to address robust in-domain and out-of-domain generalization for image classification. We first design a margin-based contrastive learning scheme that boosts the out-of-domain performance by pushing ambiguous classes apart by at least a margin. Next, we propose prototype alignment, which supports the in-domain performance by aligning the latent feature representation of each class to the corresponding class prototype. Finally, we propose a novel collaborative attention module that leverages the strengths of both positive and negative learning to enhance model robustness. Experimental results on two benchmarks show that our method achieves competitive in-domain performance and outperforms previous methods in the out-of-domain and noisy-label scenarios.
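    To make the first two ideas in the abstract concrete, the following is a minimal NumPy sketch of a margin-based contrastive loss (different-class pairs are penalized only while they are within the margin) and a prototype-alignment loss (each feature is pulled toward its per-class mean). The function names, the squared-hinge form, and the use of class means as prototypes are illustrative assumptions, not the thesis's exact formulation.

    ```python
    import numpy as np

    def margin_contrastive_loss(feats, labels, margin=1.0):
        """Hinge-style contrastive loss: same-class pairs are pulled together,
        different-class pairs are pushed at least `margin` apart."""
        n = len(feats)
        total, count = 0.0, 0
        for i in range(n):
            for j in range(i + 1, n):
                d = np.linalg.norm(feats[i] - feats[j])
                if labels[i] == labels[j]:
                    total += d ** 2                      # pull positives together
                else:
                    total += max(0.0, margin - d) ** 2   # penalize negatives inside the margin
                count += 1
        return total / count

    def prototype_alignment_loss(feats, labels):
        """Align each feature to its class prototype (here: the per-class mean)."""
        protos = {c: feats[labels == c].mean(axis=0) for c in np.unique(labels)}
        return np.mean([np.linalg.norm(f - protos[c]) ** 2
                        for f, c in zip(feats, labels)])

    # Two tight clusters, one per class, separated by more than the margin.
    feats = np.array([[0.0, 0.0], [0.1, 0.0], [2.0, 0.0], [2.1, 0.0]])
    labels = np.array([0, 0, 1, 1])
    ```

    With these well-separated clusters, the negative pairs contribute zero penalty and only the small positive-pair distances remain; moving the two classes closer than the margin would make the hinge term grow, which is the mechanism the abstract describes for separating ambiguous classes.
    
    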

    Contents
    Abstract (Chinese) i
    Abstract ii
    Acknowledgements
    1 Introduction 1
    2 Related Work 4
      2.1 Domain Generalization 4
      2.2 Contrastive Learning 5
      2.3 Noise-Label Representation Learning 6
    3 Method 7
      3.1 Margin-Based Contrastive Learning 7
      3.2 Prototype Alignment 9
      3.3 Collaborative Attention 10
      3.4 Total Loss 13
    4 Experiments 15
      4.1 Overview 15
      4.2 Datasets and Evaluation Metrics 15
        4.2.1 Implementation Details 16
      4.3 Ablation Study 17
        4.3.1 Effectiveness of Collaborative Attention 17
        4.3.2 Effectiveness of Margin-Based Contrastive Learning 18
        4.3.3 Effectiveness of Prototype Alignment 18
        4.3.4 Visualization 19
      4.4 Comparison 19
        4.4.1 In-Domain Performance 19
        4.4.2 Out-of-Domain Performance 20
        4.4.3 Model Robustness 21
    5 Conclusion 22
    References 23

