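Author: | 劉杰 Liu, Chieh |
---|---|
Thesis title: | 用於多模態資料異常檢測之擴散模型學習技術 Learning Diffusion Models for Multi-View Anomaly Detection |
Advisor: | 陳煥宗 Chen, Hwann-Tzong |
Committee members: | 賴尚宏 Lai, Shang-Hong; 許秋婷 Hsu, Chiou-Ting; 劉庭祿 Liu, Tyng-Luh |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications |
Year of publication: | 2024 |
Academic year of graduation: | 113 (ROC calendar) |
Language: | English |
Number of pages: | 41 |
Keywords (translated from Chinese): | anomaly detection, diffusion models, multi-modal |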
This thesis explores an emerging formulation of anomaly detection (AD) in which multiple observations of the same object are generated simultaneously and explicitly, addressing the limitation that a single observation may fail to capture underlying defects. More specifically, we focus on a scenario where each object of interest is associated with seven distinct data views (representations). The first six views are images captured by a stationary camera under six different lighting conditions, while the seventh view carries the 3D normal information. We refer to this task as multi-view anomaly detection.
To tackle this problem, we train a view-invariant ControlNet that produces consistent feature maps regardless of the data view. This training strategy lets us mitigate the impact of varying lighting conditions and effectively fuse information from both the RGB color appearance and the 3D normal geometry. Moreover, because the diffusion process is not deterministic, we adopt the denoising diffusion implicit model (DDIM) scheme to improve the applicability of our memory banks of diffusion-based features at anomaly detection inference time. To demonstrate the efficacy of our approach, we present extensive ablation studies and state-of-the-art experimental results on the Eyecandies dataset.
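The abstract does not give implementation details, so the following is only a minimal NumPy sketch of the two ideas it names: a deterministic DDIM update (no fresh noise is sampled, so the same input always yields the same result) and scoring against a memory bank of features from normal samples. The function names `ddim_step` and `anomaly_scores` are hypothetical, and the nearest-neighbour distance rule is an assumption based on common memory-bank practice, not a confirmed description of the thesis's method.

```python
import numpy as np


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0).

    x_t            : current noisy sample
    eps_pred       : noise predicted by the model at this step (assumed given)
    alpha_bar_t    : cumulative noise-schedule product at the current step
    alpha_bar_prev : cumulative noise-schedule product at the target step
    """
    # Estimate the clean sample implied by the predicted noise.
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)
    # Move to the target noise level without sampling any new noise.
    return np.sqrt(alpha_bar_prev) * x0_pred + np.sqrt(1.0 - alpha_bar_prev) * eps_pred


def anomaly_scores(memory_bank, test_features):
    """Distance from each test feature to its nearest neighbour in the bank.

    memory_bank   : (n_bank, d) features extracted from normal samples
    test_features : (n_test, d) features extracted from a test sample
    """
    # Pairwise differences via broadcasting, shape (n_test, n_bank, d).
    diffs = test_features[:, None, :] - memory_bank[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # A large distance means no similar normal feature was stored.
    return dists.min(axis=1)
```

Because the update is deterministic, features extracted from the same input are reproducible across runs, which is what makes comparing them against a fixed memory bank meaningful.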