| Field | Value |
| --- | --- |
| Graduate Student | Weng, Cheng-Hsin (翁正欣) |
| Thesis Title | Exploiting Adversarial Robustness in Backdoor Attacks (探討對攻擊例的穩健性與後門攻擊間的交互影響) |
| Advisor | Wu, Shan-Hung (吳尚鴻) |
| Committee Members | Peng, Wen-Chih (彭文志); Lee, Che-Rung (李哲榮); Chiu, Wei-Chen (邱維辰) |
| Degree | Master |
| Department | College of Electrical Engineering and Computer Science - Department of Computer Science |
| Year of Publication | 2020 |
| Graduation Academic Year | 108 (ROC calendar) |
| Language | Chinese |
| Number of Pages | 12 |
| Keywords | adversarial example, backdoor attack (攻擊例、後門攻擊) |
Deep neural networks are susceptible to both adversarial attacks and backdoor attacks. Although many defenses against each individual type of attack have been proposed, the interaction between a network's vulnerability to the two types of attacks has not been carefully investigated. In this thesis, we conduct experiments to study whether adversarial robustness and backdoor robustness affect each other, and we find a trade-off: as a network becomes more robust to adversarial examples, it also becomes more vulnerable to backdoor attacks. We then investigate the cause of this phenomenon and show how the trade-off can be exploited by an adversary to break several existing backdoor defenses. Our findings suggest that future work on defenses should take both adversarial and backdoor attacks into account when designing algorithms or robustness measures, in order to avoid pitfalls and a false sense of security.
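The kind of experiment described above can be made concrete with a small sketch. The following Python snippet is not the thesis's actual code; it assumes PyTorch and torchvision, MNIST, a toy fully connected classifier, a BadNets-style 3x3 corner trigger, and PGD-based adversarial training in the style of Madry et al., with purely illustrative hyperparameters. It adversarially trains a model on trigger-poisoned data and then measures the backdoor attack success rate, i.e., the fraction of triggered non-target test inputs classified as the attacker's target label.

```python
# A minimal sketch (not the thesis code): adversarially train a small classifier
# on BadNets-style poisoned MNIST, then measure the backdoor attack success rate.
# Assumes PyTorch/torchvision; the trigger, model, and hyperparameters are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import datasets, transforms

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
TARGET_LABEL, POISON_RATE = 0, 0.05          # attacker's target class and poisoning ratio
EPS, ALPHA, PGD_STEPS = 0.3, 0.1, 7          # L-inf budget, step size, and steps for PGD

def add_trigger(x):
    """Stamp a 3x3 white square in the bottom-right corner (BadNets-style trigger)."""
    x = x.clone()
    x[..., -3:, -3:] = 1.0
    return x

def pgd_attack(model, x, y):
    """L-inf PGD perturbation used for adversarial training (Madry et al. style)."""
    delta = torch.empty_like(x).uniform_(-EPS, EPS).requires_grad_(True)
    for _ in range(PGD_STEPS):
        loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + ALPHA * grad.sign()).clamp(-EPS, EPS).detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()

model = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10)).to(DEVICE)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
to_tensor = transforms.ToTensor()
train = datasets.MNIST("data", train=True, download=True, transform=to_tensor)
test = datasets.MNIST("data", train=False, download=True, transform=to_tensor)
train_loader = torch.utils.data.DataLoader(train, batch_size=128, shuffle=True)
test_loader = torch.utils.data.DataLoader(test, batch_size=256)

for epoch in range(2):  # a couple of epochs is enough for illustration
    for x, y in train_loader:
        x, y = x.to(DEVICE), y.to(DEVICE)
        # Poison a fraction of the batch: stamp the trigger and relabel to the target class.
        n_poison = int(POISON_RATE * len(x))
        x[:n_poison] = add_trigger(x[:n_poison])
        y[:n_poison] = TARGET_LABEL
        # Adversarial training: fit the model on PGD-perturbed versions of the (poisoned) batch.
        x_adv = pgd_attack(model, x, y)
        opt.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()
        opt.step()

# Backdoor attack success rate: triggered non-target test inputs classified as the target label.
hits, total = 0, 0
with torch.no_grad():
    for x, y in test_loader:
        x, y = x.to(DEVICE), y.to(DEVICE)
        keep = y != TARGET_LABEL
        pred = model(add_trigger(x[keep])).argmax(dim=1)
        hits += (pred == TARGET_LABEL).sum().item()
        total += keep.sum().item()
print(f"Backdoor attack success rate: {hits / total:.2%}")
```

Replacing x_adv with the unperturbed batch x in the training step gives a standard (non-robust) baseline; comparing the two resulting success rates is one simple way to probe the trade-off reported above.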