Multi-Task Framework for Generalized Face Anti-Spoofing with One-Side Meta Triplet Loss

Author:	莊筑鈞 Chuang, Chu-Chun
Thesis title:	Multi-Task Framework for Generalized Face Anti-Spoofing with One-Side Meta Triplet Loss (基於單邊元三元組損失函數的多任務架構應用於泛化人臉防偽辨識)
Advisor:	賴尚宏 Lai, Shang-Hong
Oral defense committee:	林嘉文 Lin, Chia-Wen; 許秋婷 Hsu, Chiu-Ting; 黃思皓 Huang, Szu-Hao
Degree:	Master
Department:	College of Electrical Engineering and Computer Science, Institute of Information Systems and Applications
Year of publication:	2021
Academic year of graduation:	109 (ROC calendar; 2020–2021)
Language:	English
Pages:	40
Keywords:	Computer vision, Deep learning, Face anti-spoofing, Multi-task, Meta learning, Domain generalization

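As presentation attacks grow more varied, model generalization has become an essential
challenge for face anti-spoofing, yet many previously proposed methods generalize
poorly. This thesis improves the generalization ability of face anti-spoofing from two
angles. First, face parsing information is used in the network so that it focuses on
the face region and learns the distribution of different facial regions. Second, a
one-side meta triplet loss works together with the meta learning process. This thesis
proposes a novel multi-task framework for generalized face anti-spoofing comprising
three tasks: depth estimation, face parsing, and spoof classification. Pixel-level
supervision in the face parsing and depth estimation tasks regularizes the learned
features, so that attacked faces are distinguished more accurately. In addition, the
proposed one-side meta triplet loss uses a margin that is enlarged in two stages and is
combined with the simulated domain shift of meta learning to improve generalization.
The framework consists of a feature extractor, a depth estimator, a U-net based face
parser, and a meta learner responsible for meta learning and classification. The U-net
based face parser contains a U-net that predicts the semantic face image and an
attention-based connection that integrates facial semantic information across
different dimensions. Experiments on four public datasets test generalization and
confirm that the proposed multi-task framework and training method generalize better
than previous methods and achieve strong results on unseen data. In some domain
generalization benchmark experiments for face anti-spoofing, our method improves AUC by
more than 6% over the reference method, and HTER also improves considerably over past
methods.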
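
The abstract above mentions a triplet loss whose margin is enlarged in two stages, but it does not give the formula. As a rough illustration only, the sketch below shows a standard triplet margin loss, max(0, d(a, p) - d(a, n) + margin), with a margin that depends on the stage. The stage names, the margin values, and the helper names are hypothetical, and the sketch does not model how the thesis selects anchors, positives, and negatives for its "one-side" variant.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    d_ap = euclidean(anchor, positive)
    d_an = euclidean(anchor, negative)
    return max(0.0, d_ap - d_an + margin)


# Hypothetical margins: the second stage uses a larger margin than the first.
# These values are placeholders, not taken from the thesis.
MARGINS = {"stage_1": 0.5, "stage_2": 1.5}


def staged_triplet_loss(anchor, positive, negative, stage):
    """Triplet loss evaluated with the margin assigned to the given stage."""
    return triplet_loss(anchor, positive, negative, MARGINS[stage])
```

With a larger margin, a triplet that already satisfies the smaller margin can still produce a non-zero loss, which keeps pushing the negative further away from the anchor.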

As presentation attacks grow more varied, model generalization has become an essential
challenge for face anti-spoofing, and many previous methods generalize poorly. This
thesis improves the generalization ability of face anti-spoofing from two aspects.
First, face parsing information encourages the network to focus on face regions and to
learn the distribution of different face parts. Second, a one-side triplet loss is
incorporated into the network to cooperate with the meta learning process. This thesis
proposes a novel multi-task face anti-spoofing framework that contains three tasks:
depth estimation, face parsing, and live/spoof classification. With pixel-wise
supervision from the face parsing and depth estimation tasks, the regularized features
can better distinguish spoof faces. While domain shift is simulated with meta learning
techniques, the proposed one-side triplet loss further improves the generalization
capability through a two-stage margin setting. Our framework consists of a feature
extractor, a depth estimator, a U-net based face parsing module, and a meta learner
that conducts meta learning and classification. The proposed U-net based face parsing
module contains a U-net for predicting the semantic face image and an attention-based
skip connection for aggregating the semantic information of different channels.
Extensive experiments on four public datasets demonstrate that the proposed framework
and training strategies generalize to unseen domains more effectively than previous
works. In some experiments on the domain generalization benchmark for face
anti-spoofing, the AUC improves by over 6% compared to the baseline, and the HTER also
improves significantly over previous methods.
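
The results above are reported as AUC and HTER. As a minimal sketch of how these two metrics are commonly computed, the code below treats HTER as the mean of the false acceptance rate and the false rejection rate at a fixed threshold, and AUC as the fraction of (live, spoof) pairs that are ranked correctly. The label convention (1 = live, 0 = spoof), the function names, and the way the threshold is supplied are assumptions made for illustration; the abstract does not state how the thesis picks its threshold.

```python
def hter(scores, labels, threshold):
    """Half Total Error Rate: mean of FAR and FRR at a fixed threshold.

    labels: 1 = live, 0 = spoof. A score >= threshold is accepted as live.
    """
    spoof = [s for s, y in zip(scores, labels) if y == 0]
    live = [s for s, y in zip(scores, labels) if y == 1]
    far = sum(s >= threshold for s in spoof) / len(spoof)  # spoof accepted as live
    frr = sum(s < threshold for s in live) / len(live)     # live rejected as spoof
    return (far + frr) / 2


def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking; ties count as 0.5."""
    spoof = [s for s, y in zip(scores, labels) if y == 0]
    live = [s for s, y in zip(scores, labels) if y == 1]
    wins = sum(
        1.0 if l > s else 0.5 if l == s else 0.0
        for l in live
        for s in spoof
    )
    return wins / (len(live) * len(spoof))
```

Lower HTER and higher AUC both indicate better separation of live and spoof samples. The pairwise AUC is quadratic in the number of samples, which is fine for small illustrative inputs.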

Introduction
  1 Motivation
  2 Problem Statement
  3 Contributions
  4 Thesis Organization
Related Work
  1 Face Anti-spoofing
    1.1 Temporal-based Methods
    1.2 Appearance-based Methods
  2 Domain Generalization
    2.1 Meta Learning for Domain Generalization
Proposed Method
  1 Overview
  2 Multi-Task Meta Learning
  3 U-net Based Face Parsing Module
    3.1 Face Parsing U-net
    3.2 Attention-Based Skip Connection
  4 One-Side Triplet Loss
  5 Objective Function
    5.1 Classification Loss
    5.2 One-Side Triplet Loss
    5.3 Segmentation Loss
    5.4 Depth Loss
    5.5 Overall Loss
  6 Network Structure
Experiments
  1 Experimental Setting
    1.1 Datasets
    1.2 Implementation Details
    1.3 Evaluation Metrics
  2 Experimental Comparisons
  3 Face Parsing Results
  4 Ablation Study
    4.1 U-net Based Face Parsing Module
    4.2 One-Side Triplet Loss with Meta Learning
  5 Visualization
    5.1 Grad-CAM Visualization
    5.2 t-SNE Visualization
    5.3 Effect of Attention-Based Skip Connection for Face Parsing
Conclusions
References
                                

