Author: | 程薰瑩 Cheng, Hsun-Ying |
---|---|
Thesis title: | 基於語意分割預測深度影像以幫助三維人臉辨識 Segmentation-Aided Depth Map Estimation for RGB-D Face Recognition |
Advisor: | 賴尚宏 Lai, Shang-Hong |
Oral defense committee: | 林嘉文 Lin, Chia-Wen; 邱瀞德 Chiu, Ching-Te; 林彥宇 Lin, Yen-Yu |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Year of publication: | 2020 |
Academic year of graduation: | 108 (ROC calendar, 2019-2020) |
Language: | English |
Number of pages: | 43 |
Keywords (Chinese): | 3D人臉辨識, 深度學習, 電腦視覺, 生成對抗網路 |
Keywords (English): | 3D Face Recognition, Deep Learning, Computer Vision, Generative Adversarial Network |
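In recent years, the rise of deep learning has driven rapid progress in many fields, including face recognition. State-of-the-art 2D face recognition can already recognize most faces encountered in daily life, but its accuracy often drops sharply on profile faces or under large pose variations. 3D face recognition, on the other hand, benefits from depth information and achieves higher accuracy on profile faces than 2D face recognition, but the scarcity of 3D face data makes it difficult to train a strong model. In this thesis, we propose a generative adversarial network, called DepthGAN, that converts RGB images into depth images to improve face recognition accuracy under large pose variations. To precisely locate the facial features and the face contour in the input image, we add a semantic segmentation step and combine the segmentation information with the original RGB input image to guide DepthGAN toward generating more plausible depth images. We then convert a large existing collection of RGB images into depth images to perform 3D face recognition, and evaluate on both 3D and 2D test data. The experimental results show that DepthGAN can not only estimate depth for profile faces but also, for faces with expressions, produce depth images that are closer to the ground-truth depth than those currently generated by face reconstruction. The face recognition experiments also show that models trained on our generated RGB-D face data effectively improve recognition accuracy on both 2D and 3D data.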
With the emergence of deep learning, we have witnessed rapid advances on many computer vision tasks, including face recognition. Although 2D face recognition can achieve extremely high accuracy on frontal images, its accuracy degrades on face images with large pose variations. 3D face recognition is considered robust to pose variation, but the number of subjects in public 3D face datasets is too small to train an accurate 3D face recognition model. In this thesis, we propose a novel framework that estimates depth maps from RGB face images and includes a semantic segmentation module for more accurate face region localization. With the segmentation mask, our depth generation module produces more reliable depth estimates, especially for faces with large pose angles. We build a large-scale RGB-D dataset by converting the VGGFace2 dataset and use it to train a robust face recognition model. Our experiments demonstrate that the proposed method provides more accurate depth estimation under large pose and expression variations. In addition, the proposed system improves face recognition accuracy on both 2D and 3D face recognition tasks across several public datasets.
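The abstract states that the semantic segmentation information is combined with the RGB input image to guide the depth generator, but it does not say how. The sketch below illustrates one common way to do this, under stated assumptions: the segmentation map is one-hot encoded and appended to the RGB values of each pixel as extra channels. The function names `one_hot` and `build_generator_input`, the number of classes, and the per-pixel concatenation itself are hypothetical and are not taken from the thesis. Plain Python lists are used only to keep the example dependency-free; a real pipeline would most likely use a tensor library.

```python
from typing import List, Sequence


def one_hot(label: int, num_classes: int) -> List[float]:
    """Return a one-hot vector for a single segmentation class label."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} is outside [0, {num_classes})")
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def build_generator_input(
    rgb: Sequence[Sequence[Sequence[float]]],
    labels: Sequence[Sequence[int]],
    num_classes: int,
) -> List[List[List[float]]]:
    """Append one-hot segmentation channels to the RGB channels of each pixel.

    rgb    : H x W x 3 image as nested sequences.
    labels : H x W map of integer class labels (assumed segmentation output).
    Returns an H x W x (3 + num_classes) nested list.
    """
    same_height = len(rgb) == len(labels)
    same_width = all(len(r) == len(l) for r, l in zip(rgb, labels))
    if not (same_height and same_width):
        raise ValueError("RGB image and label map must have the same size")
    return [
        [
            list(pixel) + one_hot(label, num_classes)
            for pixel, label in zip(rgb_row, label_row)
        ]
        for rgb_row, label_row in zip(rgb, labels)
    ]


if __name__ == "__main__":
    # A 1 x 2 toy image with three hypothetical classes
    # (for example background, skin, nose).
    toy_rgb = [[(0.1, 0.2, 0.3), (0.4, 0.5, 0.6)]]
    toy_labels = [[0, 2]]
    stacked = build_generator_input(toy_rgb, toy_labels, num_classes=3)
    print(len(stacked[0][0]))  # channels per pixel: 3 RGB + 3 class channels
```

With this layout, a generator would see both appearance and region membership at every pixel, which is one plausible reading of how the segmentation mask could guide depth estimation for large pose angles.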