
Graduate Student: Chung, Wei-Hao (鐘偉豪)
Thesis Title: Domain Generalized RPPG Network: Disentangled Feature Learning with Domain Permutation and Domain Augmentation
Advisor: Hsu, Chiou-Ting (許秋婷)
Committee Members: Lai, Shang-Hong (賴尚宏); Nordling, Torbjörn (吳馬丁); Chen, Trista (陳佩君)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science
Year of Publication: 2022
Academic Year of Graduation: 110
Language: Chinese
Number of Pages: 34
Keywords: rPPG estimation, domain generalization, domain augmentation, disentanglement

Abstract:
    Remote photoplethysmography (rPPG) offers a contactless method for monitoring physiological signals from facial videos. Existing learning-based methods, although effective in intra-dataset scenarios, degrade severely under cross-dataset testing. In this thesis, we address cross-dataset testing as a domain generalization problem and propose a novel DG-rPPGNet to learn a domain-generalized rPPG estimator. To this end, we develop a disentangled feature learning framework to disentangle rPPG, identity, and domain features from input facial videos. Next, we propose a domain permutation strategy to further constrain the disentangled rPPG features to be invariant across domains. Finally, we design a novel adversarial domain augmentation strategy to enlarge the domain sphere of DG-rPPGNet. Our experimental results show that DG-rPPGNet outperforms other rPPG estimation methods in many cross-domain settings on the UBFC-rPPG, PURE, COHFACE, and VIPL-HR datasets.
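
    Section 3.4 of the thesis performs domain augmentation via AdaIN (adaptive instance normalization). As a rough illustration only, not the thesis's actual implementation, the PyTorch sketch below shows the standard AdaIN operation such an augmentation can build on: a feature map is normalized by its own channel-wise statistics and then re-styled with new statistics, here randomly perturbed ones, to synthesize features from an unseen domain. The names adain and feat and the perturbation scale alpha are assumptions for illustration.

        import torch

        def adain(x, style_mean, style_std, eps=1e-5):
            # Adaptive Instance Normalization: strip x's own channel-wise
            # statistics, then re-scale and re-shift with the style statistics.
            dims = tuple(range(2, x.dim()))        # all non-(batch, channel) dims
            mean = x.mean(dim=dims, keepdim=True)
            std = x.std(dim=dims, keepdim=True) + eps
            return style_std * (x - mean) / std + style_mean

        # Hypothetical usage: perturb a domain feature's statistics to
        # simulate a new, unseen domain.
        feat = torch.randn(4, 64, 16, 16)          # (N, C, H, W) domain features
        mu = feat.mean(dim=(2, 3), keepdim=True)
        sigma = feat.std(dim=(2, 3), keepdim=True)
        alpha = 0.1                                # perturbation scale (assumed)
        aug_mu = mu + alpha * torch.randn_like(mu)
        aug_sigma = sigma * (1.0 + alpha * torch.randn_like(sigma))
        aug_feat = adain(feat, aug_mu, aug_sigma)  # features of an augmented domain

    Note that the abstract describes the augmentation as adversarial rather than random; the random perturbation above only fixes the interface where adversarially optimized statistics would enter.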

Table of Contents:
    Abstract (in Chinese)
    Abstract
    Acknowledgements
    1 Introduction
    2 Related Work
        2.1 Remote Photoplethysmography Estimation
        2.2 Feature Disentanglement
        2.3 Domain Generalization
    3 Method
        3.1 Problem Statement and Overview of DG-rPPGNet
        3.2 Disentangled Feature Learning
        3.3 Domain Permutation for Domain-Invariant Feature Learning
        3.4 Domain Augmentation via AdaIN
        3.5 Loss Function
        3.6 Inference Stage
    4 Experiments
        4.1 Overview
        4.2 Dataset
            4.2.1 UBFC-rPPG
            4.2.2 PURE
            4.2.3 COHFACE
            4.2.4 VIPL-HR
        4.3 Experimental Setting
            4.3.1 Cross-Domain Setting
        4.4 Implementation Details
        4.5 Evaluation Metrics
        4.6 Network Structure
            4.6.1 Global Feature Encoder
            4.6.2 Feature Extractor
            4.6.3 Feature Decoder
            4.6.4 rPPG Estimator
            4.6.5 Classifier
        4.7 Ablation Study
        4.8 Results and Comparison
        4.9 Evaluation of Few-Shot Domain Adaptation
    5 Conclusion
    References

