
Graduate Student: Chiang, Hung-Yun (姜宏昀)
Thesis Title: Shilling Black-box Review-based Recommender Systems through Fake Review Generation
Advisors: Shuai, Hong-Han (帥宏翰); Chang, Jyun-Sheng (張俊盛)
Committee Members: Kao, Hung-Yu (高宏宇); Li, Cheng-Te (李政德)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science
Year of Publication: 2023
Academic Year of Graduation: 111
Language: English
Number of Pages: 53
Chinese Keywords (translated): Recommender System, Review-based Recommender System, Shilling Attack, Reinforcement Learning, Text Generation, Review Generation, Aspect Generation
English Keywords: Review-based Recommender System, Shilling Attacks, Reinforcement Learning for Text, Review Generation, Aspect Generation
Hits: 67; Downloads: 0
Abstract (translated from Chinese): Review-based recommender systems (RBRSs) have attracted growing research interest because they build user and item representations from reviews, which alleviates the well-known cold-start problem. However, this thesis argues that this reliance on reviews may in turn expose such systems to the risk of shilling attacks. To explore this possibility, we propose the first generation-based textual attack model against RBRSs. Specifically, we train a fake-review generator (Attack Review Generator) through reinforcement learning; it maliciously promotes items by shifting the recommender's predictions once the generated reviews are added to the system. By introducing auxiliary rewards for text fluency and diversity, with the aid of a pre-trained language model and an aspect generator, the generated reviews can shill with high fidelity. Experimental results show that the proposed framework successfully attacks RBRSs trained on three Amazon corpora of different categories and on the Yelp corpus. Moreover, human evaluation shows that the generated reviews are fluent and informative. Finally, RBRSs adversarially trained with the Attack Review Generator (ARG) become more resistant to malicious reviews.


    Review-based Recommender Systems (RBRSs) have attracted increasing research interest due to their ability to alleviate the well-known cold-start problem by utilizing reviews to construct user and item representations. However, in this thesis, we argue that such a reliance on reviews may instead expose systems to the risk of being shilled. To explore this possibility, we propose the first generation-based model for shilling attacks against RBRSs. Specifically, we learn a fake review generator through reinforcement learning, which maliciously promotes items by forcing prediction shifts after adding generated reviews to the system. By introducing auxiliary rewards to increase text fluency and diversity, with the aid of pre-trained language models and aspect predictors, the generated reviews can be effective for shilling with high fidelity. Experimental results demonstrate that the proposed framework can successfully attack three different kinds of RBRSs on three domains of the Amazon corpus and on the Yelp corpus. Furthermore, human studies also show that the generated reviews are fluent and informative. Finally, equipped with Attack Review Generators (ARGs), RBRSs with adversarial training are much more robust to malicious reviews.
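The reward design summarized above (a prediction-shift reward for attack effectiveness, an inverse-perplexity reward for fluency, and a relevance reward tied to item aspects, optimized with policy-gradient reinforcement learning) can be sketched roughly as follows. This is an illustrative reconstruction, not the thesis's actual code: all function names, the weighted-sum combination, and the weights are assumptions.

```python
import math

def prediction_shift(before: float, after: float) -> float:
    # Reward grows with the upward shift of the target item's predicted
    # rating once the fake review is injected into the recommender.
    return after - before

def inverse_perplexity(log_probs: list) -> float:
    # A fluent review has low perplexity under a pre-trained language
    # model; reward the inverse of exp(mean negative log-likelihood).
    nll = -sum(log_probs) / len(log_probs)
    return 1.0 / math.exp(nll)

def relevance(review_aspects: set, item_aspects: set) -> float:
    # Fraction of the item's aspects that the generated review covers.
    if not item_aspects:
        return 0.0
    return len(review_aspects & item_aspects) / len(item_aspects)

def total_reward(shift: float, inv_ppl: float, rel: float,
                 w=(1.0, 0.5, 0.5)) -> float:
    # Weighted sum of the three rewards; weights here are illustrative.
    return w[0] * shift + w[1] * inv_ppl + w[2] * rel

def reinforce_loss(log_probs: list, reward: float,
                   baseline: float = 0.0) -> float:
    # REINFORCE (Williams, 1992): minimizing -(R - b) * sum(log pi)
    # performs gradient ascent on the expected total reward.
    return -(reward - baseline) * sum(log_probs)
```

In a full pipeline the generator would sample a review token by token, the black-box RBRS would be queried for the prediction shift, and `reinforce_loss` would be backpropagated through the generator's token log-probabilities; a baseline (e.g., self-critical decoding) reduces gradient variance.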

    Contents:
    Abstract
    Abstract (Chinese)
    Acknowledgements
    Contents
    List of Figures
    List of Tables
    1 Introduction
    2 Related Work
      2.1 Attack on Recommender System
      2.2 Review-based Recommendation System
        2.2.1 Text Generation Attacks
    3 Problem Formulation
    4 Methodology
      4.1 Attack Review Generator
      4.2 Reinforcement Learning for Attack
        4.2.1 Prediction shift reward
        4.2.2 Inverse Perplexity reward
        4.2.3 Relevance reward
      4.3 Aspect Generation
      4.4 Training Pipeline
    5 Experiments
      5.1 Experiment Setup
      5.2 RQ1: Attack Performance
      5.3 RQ2: Quality of Attack Reviews
      5.4 RQ3: Human Evaluation and Detector
      5.5 RQ4: Adversarial Training
      5.6 Ablation Study
    6 Ethical Considerations
    7 Conclusion
    References
    Appendices
    A My Appendix
      A.1 Results of ABAE
      A.2 Adversarial Training Results on Yelp Dataset
      A.3 ChatGPT as Fake Review Classifier
      A.4 Examples of Different Attack Reviews
      A.5 Alternative Metrics for Evaluating Review Quality

