
Graduate student: 黃培凱
Huang, Pei-Kai
Thesis title: Two-Class and One-Class Face Anti-Spoofing: Enhancing Vision Transformer and Generative Feature Learning
Advisor: 許秋婷
Hsu, Chiu-Ting
Oral defense committee: 陳煥宗
Chen, Hwann-Tzong
賴尚宏
Lai, Shang-Hong
劉庭祿
Liu, Tyng-Luh
鄭文皇
Cheng, Wen-Huang
林彥宇
Lin, Yen-Yu
許志仲
Hsu, Chih-Chung
Degree: Doctor
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2025
Academic year of graduation: 113
Language: English
Number of pages: 138
Keywords: face anti-spoofing, two-class classification, vision transformer, complementary channel information, learnable local descriptor, one-class classification, generative feature learning
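  • Face recognition and identity verification systems have become part of daily life, offering speed and convenience in applications such as phone unlocking, online banking authentication, and criminal suspect identification. However, the widespread use of face recognition also brings potential security risks, so dedicated techniques are needed to keep these applications secure. Face anti-spoofing (FAS) was developed for this purpose: it distinguishes live faces from spoof faces to counter presentation attacks, such as print attacks (a face printed on paper) and replay attacks (a face video played on an electronic device), and thereby secures face recognition and identity verification systems. This dissertation addresses different FAS settings and is divided into two parts: two-class FAS and one-class FAS. For two-class FAS, we investigate the use of Vision Transformers (ViTs). Experimental results show that either incorporating complementary channel information or adopting a hybrid architecture (i.e., combining convolutional neural networks with ViTs) effectively improves ViT performance on the FAS task. For one-class FAS, we find that generative feature learning effectively alleviates the lack of spoof training data. In addition, incorporating more auxiliary information and simulating potential attacks based on the characteristics of known spoof attacks in real-world scenarios further improves one-class FAS models. The following paragraph describes the overall structure and content of the dissertation.

    First, Chapter 1 introduces the background of FAS, including its basic concepts, common challenges, benchmark datasets, and evaluation metrics. Chapter 2 then reviews recent two-class and one-class FAS methods and introduces the background of Vision Transformers and generative feature learning. In the two-class FAS part, we address the two-class problem, in which both live faces and attack samples are used to train FAS models. In Chapters 3 and 4, we propose using Vision Transformers to capture subtle spoof cues and learn discriminative liveness features that separate live faces from spoof faces. In the one-class FAS part, we address the one-class setting, in which training relies on data from a single class only, namely live faces. In Chapters 5 and 6, we propose using generative feature learning to generate latent spoof features that are distinct from live-face features, simulating the missing spoof class and thereby improving one-class FAS. We conduct extensive experiments on publicly available FAS datasets to evaluate the proposed methods. The results show that our methods achieve superior performance in both the two-class and one-class settings and demonstrate potential for practical application.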


    Facial recognition and identification systems have become part of daily life and support many applications, such as cellphone unlocking, online banking authentication, and criminal identification. However, the widespread use of facial recognition also incurs potential security risks and requires dedicated techniques to keep these applications secure. Face anti-spoofing (FAS) has therefore been proposed to distinguish live faces from spoof faces and to counter facial presentation attacks, such as print attacks (i.e., printing a face on paper) and replay attacks (i.e., replaying a face video on a digital device), thereby supporting the security of facial recognition and identification systems. In this dissertation, we explore different FAS settings and divide the dissertation into two parts: two-class FAS and one-class FAS. For two-class FAS, we explore the use of Vision Transformers (ViTs) and find that either incorporating complementary channel information or employing a hybrid architecture, i.e., combining Convolutional Neural Networks (CNNs) with ViTs, effectively enhances the performance of ViTs on FAS tasks. For one-class FAS, we find that adopting generative feature learning effectively alleviates the lack of spoof training data. Additionally, incorporating richer auxiliary information, designed to simulate potential spoof attacks based on the characteristics of known spoof types in real-world scenarios, further enhances the performance of one-class FAS models. In the following paragraph, we provide an overview of the dissertation.

    First, in Chapter 1, we introduce the background of FAS, covering its fundamental concepts, common challenges, benchmark datasets, and evaluation metrics, and summarize our contributions. Next, in Chapter 2, we review recent approaches to FAS under both two-class and one-class settings and introduce the background of Vision Transformers and generative feature learning. In the two-class FAS part, we tackle the two-class problem, where both live faces and spoof attacks are used in training to develop FAS models. In Chapters 3 and 4, we propose leveraging ViTs to capture subtle spoof cues and learn discriminative liveness features for distinguishing live from spoof faces. In the one-class FAS part, we address the challenge of one-class FAS, where the training stage relies solely on one-class data (i.e., live faces) to develop FAS models. In Chapters 5 and 6, we propose using generative feature learning to generate latent spoof features distinct from the live class, simulating the absent spoof class and facilitating one-class FAS. We conduct extensive experiments on publicly available FAS datasets to evaluate the effectiveness of our methods. The experimental results show the superiority of our methods and their potential applicability in both two-class and one-class FAS settings.

    Abstract (Chinese)
    Acknowledgements (Chinese)
    Abstract
    Contents
    1 Introduction to Face Anti-Spoof (FAS)
        1.1 Fundamentals
        1.2 Common Challenges
        1.3 Common Benchmarks and Evaluation Metrics
        1.4 Contribution
        1.5 Overview
    2 Related work
        2.1 Two-class FAS methods
        2.2 One-class FAS methods
        2.3 Vision Transformer: Self-Attention and Multi-Head Self-Attention
        2.4 Generative Feature Learning
    3 Two-class FAS: Channel Difference Transformer (CDformer)
        3.1 Motivation
        3.2 Complementary Channel Information
        3.3 Channel Difference Transformer (CDformer)
            3.3.1 Channel Difference Self-Attention (CDSA)
            3.3.2 Multi-Head Channel Difference Self-Attention (MCDSA)
            3.3.3 Channel Difference Transformer
        3.4 Experiments
            3.4.1 Implementation Details
            3.4.2 Ablation Study
            3.4.3 Intra-Domain Testing
            3.4.4 Cross-Domain Testing
        3.5 Summary
    4 Two-class FAS: Enhancing Learnable Descriptive Convolutional Vision Transformer (LDCformer)
        4.1 Motivation
        4.2 LDCformer
            4.2.1 Learnable Descriptive Convolution
            4.2.2 Learnable Descriptive Convolutional Vision Transformer
        4.3 Enhancing LDCformer
            4.3.1 Dual-Attention Supervision
            4.3.2 Self-Challenging Supervision
            4.3.3 Transitional Triplet Mining
        4.4 Experiments
            4.4.1 Network Architecture and Implementation Details
            4.4.2 Ablation Study
            4.4.3 Intra-Domain Testing
            4.4.4 Cross-Domain Testing
        4.5 Summary
    5 One-Class FAS: One-Class Spoof Cue Map Estimation Network (OC-SCMNet)
        5.1 Motivation
        5.2 OC-SCMNet
            5.2.1 SCM-Guided Feature Learning
            5.2.2 Feature-Enhanced SCM Estimation
        5.3 Experiments
            5.3.1 Experiment Settings
            5.3.2 Ablation Study
            5.3.3 Intra-Domain Testing
            5.3.4 Cross-Domain Testing
            5.3.5 One-Shot Testing
        5.4 Summary
    6 One-Class FAS: OC-SCMNet++
        6.1 Motivation
        6.2 Multiple Auxiliary Maps (MAM)-Guided Feature Learning
            6.2.1 SCM-Guided Feature Learning
            6.2.2 MAM-Guided Feature Learning
            6.2.3 Feature-Enhanced SCM Estimation
        6.3 Experiments
            6.3.1 Ablation Study
            6.3.2 Intra-Domain Testing
            6.3.3 Cross-Domain Testing
        6.4 Summary
    7 Conclusion and Future Work
    References
    A Derivation of Converting CDSA to SA and MCDSA to MSA
        A.1 Derivation of Converting CDSA to SA
        A.2 Derivation of Converting MCDSA to MSA

    [1] Akshay Agarwal, Richa Singh, Mayank Vatsa, and Afzel Noore. Boosting face presentation attack detection in multi-spectral videos through score fusion of wavelet partition images. Frontiers in big Data, page 53, 2022.

    [2] Yashasvi Baweja, Poojan Oza, Pramuditha Perera, and Vishal M Patel. Anomaly detection-based unknown face presentation attack detection. In 2020 IEEE International Joint Conference on Biometrics (IJCB), pages 1–9. IEEE, 2020.

    [3] Ying Bian, Peng Zhang, Jingjing Wang, Chunmao Wang, and Shiliang Pu. Learning multiple explainable and generalizable cues for face anti-spoofing. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2310–2314. IEEE, 2022.

    [4] Zinelabidine Boulkenafet, Jukka Komulainen, and Abdenour Hadid. Face spoofing detection using colour texture analysis. IEEE Transactions on Information Forensics and Security, 11(8):1818–1830, 2016.

    [5] Zinelabinde Boulkenafet, Jukka Komulainen, Lei Li, Xiaoyi Feng, and Abdenour Hadid. Oulu-npu: A mobile face presentation attack database with real-world variations. In 2017 12th IEEE international conference on automatic face & gesture recognition (FG 2017), pages 612–618. IEEE, 2017.

    [6] Baoliang Chen, Wenhan Yang, Haoliang Li, Shiqi Wang, and Sam Kwong. Camera invariant feature learning for generalized face anti-spoofing. IEEE Transactions on Information Forensics and Security, 16:2477–2492, 2021.

    [7] Zhihong Chen, Taiping Yao, Kekai Sheng, Shouhong Ding, Ying Tai, Jilin Li, Feiyue Huang, and Xinyu Jin. Generalizable representation learning for mixture domain face anti-spoofing. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 1132–1139, 2021.

    [8] Ivana Chingovska, Andre Anjos, and Sebastien Marcel. On the effectiveness of local binary patterns in face anti-spoofing. In 2012 BIOSIG-proceedings of the international conference of biometrics special interest group (BIOSIG), pages 1–7. IEEE, 2012.

    [9] Ivana Chingovska, Andre Rabello Dos Anjos, and Sebastien Marcel. Biometrics evaluation under spoofing attacks. IEEE transactions on Information Forensics and Security, 9(12):2264–2276, 2014.

    [10] Florinel-Alin Croitoru, Vlad Hondru, Radu Tudor Ionescu, and Mubarak Shah. Diffusion models in vision: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(9):10850–10869, 2023.

    [11] Pengchao Deng, Chenyang Ge, Hao Wei, Yuan Sun, and Xin Qiao. Attentionaware dual-stream network for multimodal face anti-spoofing. IEEE Transactions on Information Forensics and Security, 2023.

    [12] Jeff Donahue, Philipp Krahenbuhl, and Trevor Darrell. Adversarial feature learning. In International Conference on Learning Representations, 2017.

    [13] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. An image is worth 16x16 words: Transformers for image recognition at scale. In ICLR, 2021.

    [14] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations, 2021.

    [15] Nesli Erdogmus and Sebastien Marcel. Spoofing face recognition with 3d masks. IEEE transactions on information forensics and security, 9(7):1084–1097, 2014.

    [16] Meiling Fang, Naser Damer, Florian Kirchbuchner, and Arjan Kuijper. Learnable multi-level frequency decomposition and hierarchical attention mechanism for generalized face presentation attack detection. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 3722–3731, 2022.

    [17] Haocheng Feng, Zhibin Hong, Haixiao Yue, Yang Chen, Keyao Wang, Junyu Han, Jingtuo Liu, and Errui Ding. Learning generalized spoof cues for face anti-spoofing. arXiv preprint arXiv:2005.03922, 2020.

    [18] Tiago de Freitas Pereira, Andre Anjos, Jose Mario De Martino, and Sebastien Marcel. Lbp- top based countermeasure against face spoofing attacks. In Asian Conference on Computer Vision, pages 121–132. Springer, 2012.

    [19] Javier Galbally, Fernando Alonso-Fernandez, Julian Fierrez, and Javier OrtegaGarcia. A high performance fingerprint liveness detection method based on quality related features. Future Generation Computer Systems, 28(1):311–321, 2012.

    [20] Yu Gao, Xintong Han, Xun Wang, Weilin Huang, and Matthew Scott. Channel interaction networks for fine-grained image categorization. In Proceedings of the AAAI conference on artificial intelligence, volume 34, pages 10818–10825, 2020.

    [21] Xinxu Ge, Xin Liu, Zitong Yu, Jingang Shi, Chun Qi, Jie Li, and Heikki Kalviainen. Difffas: face anti-spoofing via generative diffusion models. In European Conference on Computer Vision, pages 144–161. Springer, 2024.

    [22] Anjith George and Sebastien Marcel. Cross modal focal loss for rgbd face anti-spoofing. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 7882–7891, 2021.

    [23] Anjith George and Sebastien Marcel. On the effectiveness of vision transformers for zero-shot face anti-spoofing. In 2021 IEEE International Joint Conference on Biometrics (IJCB), pages 1–8. IEEE, 2021.

    [24] Jianzhu Guo, Xiangyu Zhu, Yang Yang, Fan Yang, Zhen Lei, and Stan Z Li. Towards fast, accurate and stable 3d dense face alignment. In Proceedings of the European Conference on Computer Vision (ECCV), 2020.

    [25] Xiao Guo, Yaojie Liu, Anil Jain, and Xiaoming Liu. Multi-domain learning for updating face anti-spoofing models. In European conference on computer vision, 2022.

    [26] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.

    [27] Chengyang Hu, Ke-Yue Zhang, Taiping Yao, Shouhong Ding, and Lizhuang Ma. Rethinking generalizable face anti-spoofing via hierarchical prototype-guided distribution refinement in hyperbolic space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1032–1041, 2024.

    [28] H. Huang, Y. Xiang, G. Yang, L. Lv, X. Li, Z. Weng, and Y. Fu. Generalized face anti-spoofing via cross-adversarial disentanglement with mixing augmentation.In ICASSP 2022-2022 IEEE international conference on acoustics, speech and signal processing (ICASSP), pages 2939–2943, 2022.

    [29] Hsin-Ping Huang, Deqing Sun, Yaojie Liu, Wen-Sheng Chu, Taihong Xiao, Jinwei Yuan, Hartwig Adam, and Ming-Hsuan Yang. Adaptive transformers for robust few-shot cross-domain face anti-spoofing. In European conference on computer vision, 2022.

    [30] Pei-Kai Huang, Chu-Ling Chang, Hui-Yu Ni, and Chiou-Ting Hsu. Learning to augment face presentation attack dataset via disentangled feature learning from limited spoof data. In 2022 IEEE International Conference on Multimedia and Expo (ICME), pages 1–6. IEEE, 2022.

    [31] Pei-Kai Huang, Cheng-Hsuan Chiang, Tzu-Hsien Chen, Jun-Xiong Chong, TyngLuh Liu, and Chiou-Ting Hsu. One-class face anti-spoofing via spoof cue mapguided feature learning. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2024.

    [32] Pei-Kai Huang, Cheng-Hsuan Chiang, Jun-Xiong Chong, Tzu-Hsien Chen, HuiYu Ni, and Chiou-Ting Hsu. Ldcformer: Incorporating learnable descriptive convolution to vision transformer for face anti-spoofing. In 2019 IEEE international conference on image processing (ICIP), 2023.

    [33] Pei-Kai Huang, Ming-Chieh Chin, and Chiou-Ting Hsu. Face anti-spoofing via robust auxiliary estimation and discriminative feature learning. In Asian Conference on Pattern Recognition, pages 443–458. Springer, 2021.

    [34] Pei-Kai Huang, Ming-Chieh Chin, and Chiou-Ting Hsu. Face anti-spoofing via robust auxiliary estimation and discriminative feature learning. In Asian Conference on Pattern Recognition, pages 443–458. Springer, 2022.

    [35] Pei-Kai Huang, Jun-Xiong Chong, Cheng-Hsuan Chiang, Tzu-Hsien Chen, TyngLuh Liu, and Chiou-Ting Hsu. Slip: Spoof-aware one-class face anti-spoofing with language image pretraining. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 3697–3706, 2025.

    [36] Pei-Kai Huang, Jun-Xiong Chong, Ming-Tsung Hsu, Fang-Yu Hsu, Cheng-Hsuan Chiang, Tzu-Hsien Chen, Chiou-Ting Hsu, et al. A survey on deep learning-based face anti-spoofing. APSIPA Transactions on Signal and Information Processing, 13(1), 2024.

    [37] Pei-Kai Huang, Jun-Xiong Chong, Ming-Tsung Hsu, Fang-Yu Hsu, and ChiouTing Hsu. Channel difference transformer for face anti-spoofing. Information Sciences, page 121904, 2025.

    [38] Pei-Kai Huang, Jun-Xiong Chong, Hui-Yu Ni, Tzu-Hsien Chen, and Chiou-Ting Hsu. Towards diverse liveness feature representation and domain expansion for cross-domain face anti-spoofing. In 2023 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 2023.

    [39] Pei-Kai Huang, Chen-Yu Lu, Shu-Jung Chang, Jun-Xiong Chong, and Chiou-Ting Hsu. Test-time adaptation for robust face anti-spoofing. In British Machine Vision Conference (BMVC), 2023.

    [40] Pei-Kai Huang, Hui-Yu Ni, Yanqin Ni, and Chiou-Ting Hsu. Learnable descriptive convolutional network for face anti-spoofing. In British Machine Vision Conference (BMVC), volume 2, page 7, 2022.

    [41] Rui Huang and Xin Wang. Face anti-spoofing using feature distilling and global attention learning. Pattern Recognition, 135:109147, 2023.

    [42] Xiaobin Huang, Jingtian Xia, and Linlin Shen. One-class face anti-spoofing based on attention auto-encoder. In Biometric Recognition: 15th Chinese Conference, CCBR 2021, Shanghai, China, September 10–12, 2021, Proceedings 15, pages 365–373. Springer, 2021.

    [43] Zeyi Huang, Haohan Wang, Eric P Xing, and Dong Huang. Self-challenging improves cross-domain generalization. In European conference on computer vision. Springer, 2020.

    [44] Yunpei Jia, Jie Zhang, and Shiguang Shan. Dual-branch meta-learning network with distribution alignment for face anti-spoofing. IEEE Transactions on Information Forensics and Security, 17:138–151, 2021.

    [45] Yunpei Jia, Jie Zhang, Shiguang Shan, and Xilin Chen. Single-side domain generalization for face anti-spoofing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8484–8493, 2020.

    [46] Amin Jourabloo, Yaojie Liu, and Xiaoming Liu. Face de-spoofing: Anti-spoofing via noise modeling. In Proceedings of the European Conference on Computer Vision (ECCV), pages 290–306, 2018.

    [47] Felix Juefei-Xu, Vishnu Naresh Boddeti, and Marios Savvides. Local binary convolutional neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 19–28, 2017.

    [48] Taewook Kim, YongHyun Kim, Inhan Kim, and Daijin Kim. Basn: Enriching feature representation using bipartite auxiliary supervisions for face anti-spoofing. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, pages 0–0, 2019.

    [49] Jukka Komulainen, Abdenour Hadid, and Matti Pietikainen. Context based face anti-spoofing. In 2013 IEEE Sixth International Conference on Biometrics: Theory, Applications and Systems (BTAS), pages 1–8. IEEE, 2013.

    [50] Binh M Le and Simon S Woo. Gradient alignment for cross-domain face antispoofing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 188–199, 2024.

    [51] Guangjun Li, Yongxiong Wang, and Fengting Zhu. Multi-branch channel-wise enhancement network for fine-grained visual recognition. In Proceedings of the 29th ACM International Conference on Multimedia, 2021.

    [52] Jinxing Li, Bob Zhang, Guangming Lu, and David Zhang. Generative multi-view and multi-feature learning for classification. Information Fusion, 45:215–226, 2019.

    [53] Kaicheng Li, Hongyu Yang, Binghui Chen, Pengyu Li, Biao Wang, and Di Huang. Learning polysemantic spoof trace: A multi-modal disentanglement network for face anti-spoofing. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 1351–1359, 2023.

    [54] Xiaobai Li, Jukka Komulainen, Guoying Zhao, Pong-Chi Yuen, and Matti Pietikainen. Generalized face anti-spoofing by detecting pulse from face videos. In 2016 23rd International Conference on Pattern Recognition (ICPR), pages 4244–4249. IEEE, 2016.

    [55] Zhi Li, Rizhao Cai, Haoliang Li, Kwok-Yan Lam, Yongjian Hu, and Alex C Kot. One-class knowledge distillation for face presentation attack detection. IEEE Transactions on Information Forensics and Security, 17:2137–2150, 2022.

    [56] Chen-Hao Liao, Wen-Cheng Chen, Hsuan-Tung Liu, Yi-Ren Yeh, Min-Chun Hu, and Chu-Song Chen. Domain invariant vision transformer learning for face antispoofing. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 6098–6107, 2023.

    [57] Seokjae Lim, Yongjae Gwak, Wonjun Kim, Jong-Hyuk Roh, and Sangrae Cho. One-class learning method based on live correlation loss for face anti-spoofing. volume 8, pages 201635–201648. IEEE, 2020.

    [58] Tsung-Yu Lin, Aruni RoyChowdhury, and Subhransu Maji. Bilinear cnn models for fine-grained visual recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2015.

    [59] Xun Lin, Shuai Wang, Rizhao Cai, Yizhong Liu, Ying Fu, Zitong Yu, Wenzhong Tang, and Alex Kot. Suppress and rebalance: Towards generalized multi-modal face anti-spoofing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

    [60] Ajian Liu and Yanyan Liang. Ma-vit: Modality-agnostic vision transformers for face anti-spoofing. arXiv preprint arXiv:2304.07549, 2023.

    [61] Ajian Liu, Zichang Tan, Jun Wan, Yanyan Liang, Zhen Lei, Guodong Guo, and Stan Z Li. Face anti-spoofing via adversarial cross-modality translation. IEEE Transactions on Information Forensics and Security, 16:2759–2772, 2021.

    [62] Ajian Liu, Zichang Tan, Zitong Yu, Chenxu Zhao, Jun Wan, Yanyan Liang Zhen Lei, Du Zhang, Stan Z Li, and Guodong Guo. Fm-vit: Flexible modal vision transformers for face anti-spoofing. IEEE Transactions on Information Forensics and Security, 2023.

    [63] Si-Qi Liu, Xiangyuan Lan, and Pong C Yuen. Remote photoplethysmography correspondence feature for 3d mask face presentation attack detection. In Proceedings of the European Conference on Computer Vision (ECCV), pages 558–573, 2018.

    [64] Siqi Liu, Pong C Yuen, Shengping Zhang, and Guoying Zhao. 3d mask face anti-spoofing with remote photoplethysmography. In European Conference on Computer Vision, pages 85–100. Springer, 2016.

    [65] Yaojie Liu, Amin Jourabloo, and Xiaoming Liu. Learning deep models for face anti-spoofing: Binary or auxiliary supervision. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 389–398, 2018.

    [66] Yaojie Liu and Xiaoming Liu. Spoof trace disentanglement for generic face anti-spoofing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(3):3813–3830, 2022.

    [67] Yaojie Liu, Joel Stehouwer, and Xiaoming Liu. On disentangling spoof trace for generic face anti-spoofing. In European Conference on Computer Vision, pages 406–422. Springer, 2020.

    [68] Yuchen Liu, Yabo Chen, Mengran Gou, Chun-Ting Huang, Yaoming Wang, Wenrui Dai, and Hongkai Xiong. Towards unsupervised domain generalization for face anti-spoofing. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023.

    [69] Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021.

    [70] Yimei Ma, Jianjun Qian, Jun Li, and Jian Yang. Dual feature disentanglement for face anti-spoofing. Pattern Recognition, page 110656, 2024.

    [71] Jukka Maatta, Abdenour Hadid, and Matti Pietikainen. Face spoofing detection from single images using micro-texture analysis. In 2011 international joint conference on Biometrics (IJCB), pages 1–7. IEEE, 2011.

    [72] Sachin Mehta and Mohammad Rastegari. Mobilevit: Light-weight, generalpurpose, and mobile-friendly vision transformer. In International Conference on Learning Representations, 2021.

    [73] Kartik Narayan and Vishal M Patel. Hyp-oc: Hyperbolic one class classification for face anti-spoofing. arXiv preprint arXiv:2404.14406, 2024.

    [74] Olegs Nikisins, Amir Mohammadi, Andre Anjos, and Sebastien Marcel. On effectiveness of anomaly detection approaches against unseen presentation attacks in face anti-spoofing. In 2018 International Conference on Biometrics (ICB), pages 75–81. IEEE, 2018.

    [75] Keiron O’Shea and Ryan Nash. An introduction to convolutional neural networks. arXiv preprint arXiv:1511.08458, 2015.

    [76] Keyurkumar Patel, Hu Han, and Anil K. Jain. Secure face unlock: Spoof detection on smartphones. IEEE Transactions on Information Forensics and Security, 11(10):2268–2283, 2016.

    [77] Yunxiao Qin, Zitong Yu, Longbin Yan, Zezheng Wang, Chenxu Zhao, and Zhen Lei. Meta-teacher for face anti-spoofing. IEEE transactions on pattern analysis and machine intelligence, 2021.

    [78] Raghavendra Ramachandra and Christoph Busch. Presentation attack detection methods for face recognition systems: A comprehensive survey. ACM Computing Surveys (CSUR), 50(1):1–37, 2017.

    [79] Mohammad Rostami, Leonidas Spinoulas, Mohamed Hussein, Joe Mathai, and Wael Abd-Almageed. Detection and continual learning of novel face presentation attacks. In Proceedings of the IEEE/CVF international conference on computer vision, pages 14851–14860, 2021.

    [80] Nilay Sanghvi, Sushant Kumar Singh, Akshay Agarwal, Mayank Vatsa, and Richa Singh. Mixnet for generalized face presentation attack detection. In 2020 25th International Conference on Pattern Recognition (ICPR), pages 5511–5518. IEEE, 2021.

    [81] Rui Shao, Xiangyuan Lan, Jiawei Li, and Pong C Yuen. Multi-adversarial discriminative deep domain generalization for face presentation attack detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10023–10031, 2019.

    [82] Rui Shao, Xiangyuan Lan, and Pong C Yuen. Regularized fine-grained meta face anti-spoofing. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 11974–11981, 2020.

    [83] Yichen Shi, Yuhao Gao, Yingxin Lai, Hongyang Wang, Jun Feng, Lei He, Jun Wan, Changsheng Chen, Zitong Yu, and Xiaochun Cao. Shield: An evaluation benchmark for face spoofing and forgery detection with multimodal large language models. arXiv preprint arXiv:2402.04178, 2024.

    [84] Koushik Srivatsan, Muzammal Naseer, and Karthik Nandakumar. Flip: Crossdomain face anti-spoofing with language guidance. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 19685–19696, 2023.

    [85] Zhuo Su, Wenzhe Liu, Zitong Yu, Dewen Hu, Qing Liao, Qi Tian, Matti Pietikainen, and Li Liu. Pixel difference networks for efficient edge detection. ¨ In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2021.

    [86] Yiyou Sun, Yaojie Liu, Xiaoming Liu, Yixuan Li, and Wen-Sheng Chu. Rethinking domain generalization for face anti-spoofing: Separability and alignment.In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2023.

    [87] Z. Sun and X. Li. Contrast-phys: Unsupervised video-based remote physiological measurement via spatiotemporal contrast. In European conference on computer vision, pages 492–510, 2022.

    [88] Yu Tian, Yalin Huang, Kunbo Zhang, Yue Liu, and Zhenan Sun. Polarized image translation from nonpolarized cameras for multimodal face anti-spoofing. IEEE Transactions on Information Forensics and Security, 2023.

    [89] Xiaoguang Tu, Zheng Ma, Jian Zhao, Guodong Du, Mei Xie, and Jiashi Feng. Learning generalizable and identity-discriminative representations for face antispoofing. ACM Transactions on Intelligent Systems and Technology (TIST), 11(5):1–19, 2020.

    [90] Laurens Van der Maaten and Geoffrey Hinton. Visualizing data using t-sne. Journal of machine learning research, 9(11), 2008.

    [91] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017.

    [92] Caixun Wang, Bingyao Yu, and Jie Zhou. A learnable gradient operator for face presentation attack detection. Pattern Recognition, 135:109146, 2023.

    [93] Chien-Yi Wang, Y.D. Lu, S.T. Yang, and S.H. Lai. Patchnet: A simple face anti-spoofing framework via fine-grained patch recognition. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022.

    [94] Guoqing Wang, Hu Han, Shiguang Shan, and Xilin Chen. Cross-domain face presentation attack detection via multi-domain disentangled representation learning.In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6678–6687, 2020.

    [95] Jingjing Wang, Jingyi Zhang, Ying Bian, Youyi Cai, Chunmao Wang, and Shiliang Pu. Self-domain adaptation for face anti-spoofing. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 2746–2754, 2021.

    [96] W. Wang, P. Liu, H. Zheng, R. Ying, and F. Wen. Domain generalization for face anti-spoofing via negative data augmentation. IEEE Transactions on Information Forensics and Security, 18:2333–2344, 2023.

    [97] Weihang Wang, Fei Wen, Haoyuan Zheng, Rendong Ying, and Peilin Liu. Convmlp: A convolution and mlp mixed model for multimodal face anti-spoofing. IEEE Trans. Inf. Forensics Secur., 17:2284–2297, 2022.

    [98] Yu-Chun Wang, Chien-Yi Wang, and Shang-Hong Lai. Disentangled representation with dual-stage feature learning for face anti-spoofing. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 1955–1964, 2022.

    [99] Zezheng Wang, Zitong Yu, Chenxu Zhao, Xiangyu Zhu, Yunxiao Qin, Qiusheng Zhou, Feng Zhou, and Zhen Lei. Deep spatial gradient and temporal depth learning for face anti-spoofing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5042–5051, 2020.

    [100] Zhuo Wang, Qiangchang Wang, Weihong Deng, and Guodong Guo. Face anti-spoofing using transformers with relation-aware mechanism. IEEE Transactions on Biometrics, Behavior, and Identity Science, 4(3):439–450, 2022.

    [101] Zhuo Wang, Qiangchang Wang, Weihong Deng, and Guodong Guo. Learning multi-granularity temporal characteristics for face anti-spoofing. IEEE Transactions on Information Forensics and Security, 17:1254–1269, 2022.

    [102] Zhuo Wang, Zezheng Wang, Zitong Yu, Weihong Deng, Jiahong Li, Tingting Gao, and Zhongyuan Wang. Domain generalization via shuffled style assembly for face anti-spoofing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4123–4133, 2022.

    [103] Di Wen, Hu Han, and Anil K Jain. Face spoof detection with image distortion analysis. IEEE Transactions on Information Forensics and Security, 10(4):746–761, 2015.

    [104] Yiqiang Wu, Dapeng Tao, Yong Luo, Jun Cheng, and Xuelong Li. Covered style mining via generative adversarial networks for face anti-spoofing. Pattern Recognition, 132:108957, 2022.

    [105] Tete Xiao, Mannat Singh, Eric Mintun, Trevor Darrell, Piotr Dollar, and Ross Girshick. Early convolutions help transformers see better. 2021.

    [106] Yinghao Xu, Yujun Shen, Jiapeng Zhu, Ceyuan Yang, and Bolei Zhou. Generative hierarchical features from synthesizing images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4432–4442, 2021.

    [107] Jianwei Yang, Zhen Lei, Shengcai Liao, and Stan Z Li. Face liveness detection with component dependent descriptor. In 2013 International Conference on Biometrics (ICB), pages 1–6. IEEE, 2013.

    [108] Xiao Yang, Wenhan Luo, Linchao Bao, Yuan Gao, Dihong Gong, Shibao Zheng, Zhifeng Li, and Wei Liu. Face anti-spoofing: Model matters, so does data. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3507–3516, 2019.

    [109] William J Youden. Index for rating diagnostic tests. Cancer, 3(1):32–35, 1950.

    [110] Chaojian Yu, Xinyi Zhao, Qi Zheng, Peng Zhang, and Xinge You. Hierarchical bilinear pooling for fine-grained visual recognition. In European conference on computer vision, 2018.

    [111] Zitong Yu, Rizhao Cai, Yawen Cui, Xin Liu, Yongjian Hu, and Alex C Kot. Rethinking vision transformer and masked autoencoder in multimodal face anti-spoofing. International Journal of Computer Vision, pages 1–22, 2024.

    [112] Zitong Yu, Xiaobai Li, Xuesong Niu, Jingang Shi, and Guoying Zhao. Face anti-spoofing with human material perception. In European Conference on Computer Vision, pages 557–575. Springer, 2020.

    [113] Zitong Yu, Xiaobai Li, Jingang Shi, Zhaoqiang Xia, and Guoying Zhao. Revisiting pixel-wise supervision for face anti-spoofing. IEEE Transactions on Biometrics, Behavior, and Identity Science, 3(3):285–295, 2021.

    [114] Zitong Yu, Xiaobai Li, Pichao Wang, and Guoying Zhao. Transrppg: Remote photoplethysmography transformer for 3d mask face presentation attack detection. IEEE Signal Processing Letters, 28:1290–1294, 2021.

    [115] Zitong Yu, Yunxiao Qin, Xiaobai Li, Zezheng Wang, Chenxu Zhao, Zhen Lei, and Guoying Zhao. Multi-modal face anti-spoofing based on central difference networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 650–651, 2020.

    [116] Zitong Yu, Yunxiao Qin, Hengshuang Zhao, Xiaobai Li, and Guoying Zhao. Dual-cross central difference network for face anti-spoofing. In IJCAI, 2021.

    [117] Zitong Yu, Jun Wan, Yunxiao Qin, Xiaobai Li, Stan Z Li, and Guoying Zhao. Nas-fas: Static-dynamic central difference network search for face anti-spoofing. IEEE transactions on pattern analysis and machine intelligence, 43(9):3005–3023, 2020.

    [118] Zitong Yu, Chenxu Zhao, Zezheng Wang, Yunxiao Qin, Zhuo Su, Xiaobai Li, Feng Zhou, and Guoying Zhao. Searching central difference convolutional networks for face anti-spoofing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5295–5305, 2020.

    [119] Zhenxun Yuan, Xiao Song, Lei Bai, Zhe Wang, and Wanli Ouyang. Temporal-channel transformer for 3d lidar-based video object detection for autonomous driving. IEEE Transactions on Circuits and Systems for Video Technology, 32(4):2068–2078, 2021.

    [120] Ke-Yue Zhang, Taiping Yao, Jian Zhang, Ying Tai, Shouhong Ding, Jilin Li, Feiyue Huang, Haichuan Song, and Lizhuang Ma. Face anti-spoofing via disentangled representation learning. In European Conference on Computer Vision, pages 641–657. Springer, 2020.

    [121] Shifeng Zhang, Ajian Liu, Jun Wan, Yanyan Liang, Guodong Guo, Sergio Escalera, Hugo Jair Escalante, and Stan Z Li. Casia-surf: A large-scale multi-modal benchmark for face anti-spoofing. IEEE Transactions on Biometrics, Behavior, and Identity Science, 2(2):182–193, 2020.

    [122] Yuanhan Zhang, ZhenFei Yin, Yidong Li, Guojun Yin, Junjie Yan, Jing Shao, and Ziwei Liu. Celeba-spoof: Large-scale face anti-spoofing dataset with rich annotations. In European Conference on Computer Vision, pages 70–85. Springer, 2020.

    [123] Zhiwei Zhang, Junjie Yan, Sifei Liu, Zhen Lei, Dong Yi, and Stan Z Li. A face antispoofing database with diverse attacks. In 2012 5th IAPR international conference on Biometrics (ICB), pages 26–31. IEEE, 2012.

    [124] Tianyi Zheng, Bo Li, Shuang Wu, Ben Wan, Guodong Mu, Shice Liu, Shouhong Ding, and Jia Wang. Mfae: Masked frequency autoencoders for domain generalization face anti-spoofing. IEEE Transactions on Information Forensics and Security, 2024.

    [125] Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Xuequan Lu, Shouhong Ding, and Lizhuang Ma. Test-time domain generalization for face anti-spoofing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 175–187, 2024.

    [126] Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Xuequan Lu, Ran Yi, Shouhong Ding, and Lizhuang Ma. Instance-aware domain generalization for face anti-spoofing. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2023.

    [127] Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Ran Yi, Kekai Sheng, Shouhong Ding, and Lizhuang Ma. Generative domain adaptation for face anti-spoofing. In European conference on computer vision. Springer, 2022.

    [128] Hang Zou, Chenxi Du, Hui Zhang, Yuan Zhang, Ajian Liu, Jun Wan, and Zhen Lei. La-softmoe clip for unified physical-digital face attack detection. arXiv preprint arXiv:2408.12793, 2024.
