
Graduate Student: Chiang, Hung-Yun (姜宏昀)
Thesis Title: Shilling Black-box Review-based Recommender Systems through Fake Review Generation
Advisors: Shuai, Hong-Han (帥宏翰); Chang, Jyun-Sheng (張俊盛)
Committee Members: Kao, Hung-Yu (高宏宇); Li, Cheng-Te (李政德)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science
Year of Publication: 2023
Academic Year of Graduation: 111
Language: English
Number of Pages: 53
Chinese Keywords (translated): Recommender System, Review-based Recommender System, Shilling Attack, Reinforcement Learning, Text Generation, Review Generation, Aspect Generation
English Keywords: Review-based Recommender System, Shilling Attacks, Reinforcement Learning for Text, Review Generation, Aspect Generation
Hits: 67; Downloads: 0
Abstract (translated from Chinese): Review-based recommender systems (RBRSs) have attracted growing research interest because they build user and item representations from reviews, which alleviates the well-known cold-start problem. However, this thesis argues that this reliance on reviews may in turn expose such systems to the risk of shilling attacks. To explore this possibility, we propose the first generation-based textual attack model against RBRSs. Specifically, we train a fake-review generator (Attack Review Generator) through reinforcement learning; it maliciously promotes items by shifting the recommender's predictions once the generated reviews are added to the system. By introducing auxiliary rewards for text fluency and diversity, with the aid of a pre-trained language model and an aspect generator, the generated reviews can shill with high fidelity. Experimental results show that the proposed framework successfully attacks RBRSs trained on three Amazon corpora of different categories and on the Yelp corpus. Moreover, human evaluation shows that the generated reviews are fluent and informative. Finally, RBRSs adversarially trained with the Attack Review Generator (ARG) become more resistant to malicious reviews.


    Review-based Recommender Systems (RBRSs) have attracted increasing research interest due to their ability to alleviate the well-known cold-start problem by utilizing reviews to construct user and item representations. However, in this thesis, we argue that such a reliance on reviews may instead expose systems to the risk of being shilled. To explore this possibility, we propose the first generation-based model for shilling attacks against RBRSs. Specifically, we learn a fake review generator through reinforcement learning, which maliciously promotes items by forcing prediction shifts after adding generated reviews to the system. By introducing auxiliary rewards to increase text fluency and diversity, with the aid of pre-trained language models and aspect predictors, the generated reviews can be effective for shilling with high fidelity. Experimental results demonstrate that the proposed framework can successfully attack three different kinds of RBRSs on three domains of the Amazon corpus and on the Yelp corpus. Furthermore, human studies also show that the generated reviews are fluent and informative. Finally, equipped with Attack Review Generators (ARGs), RBRSs with adversarial training are much more robust to malicious reviews.
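The reward design summarized above (a prediction-shift reward for attack effectiveness, an inverse-perplexity reward for fluency, and a relevance reward tied to item aspects, optimized with policy-gradient reinforcement learning) can be sketched roughly as follows. This is an illustrative reconstruction, not the thesis's actual code: all function names, the weighted-sum combination, and the weights are assumptions.

```python
import math

def prediction_shift(before: float, after: float) -> float:
    # Reward grows with the upward shift of the target item's predicted
    # rating once the fake review is injected into the recommender.
    return after - before

def inverse_perplexity(log_probs: list) -> float:
    # A fluent review has low perplexity under a pre-trained language
    # model; reward the inverse of exp(mean negative log-likelihood).
    nll = -sum(log_probs) / len(log_probs)
    return 1.0 / math.exp(nll)

def relevance(review_aspects: set, item_aspects: set) -> float:
    # Fraction of the item's aspects that the generated review covers.
    if not item_aspects:
        return 0.0
    return len(review_aspects & item_aspects) / len(item_aspects)

def total_reward(shift: float, inv_ppl: float, rel: float,
                 w=(1.0, 0.5, 0.5)) -> float:
    # Weighted sum of the three rewards; weights here are illustrative.
    return w[0] * shift + w[1] * inv_ppl + w[2] * rel

def reinforce_loss(log_probs: list, reward: float,
                   baseline: float = 0.0) -> float:
    # REINFORCE (Williams, 1992): minimizing -(R - b) * sum(log pi)
    # performs gradient ascent on the expected total reward.
    return -(reward - baseline) * sum(log_probs)
```

In a full pipeline the generator would sample a review token by token, the black-box RBRS would be queried for the prediction shift, and `reinforce_loss` would be backpropagated through the generator's token log-probabilities; a baseline (e.g., self-critical decoding) reduces gradient variance.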

    Contents:
    Abstract
    Abstract (Chinese)
    Acknowledgements
    Contents
    List of Figures
    List of Tables
    1 Introduction
    2 Related Work
      2.1 Attack on Recommender System
      2.2 Review-based Recommendation System
        2.2.1 Text Generation Attacks
    3 Problem Formulation
    4 Methodology
      4.1 Attack Review Generator
      4.2 Reinforcement Learning for Attack
        4.2.1 Prediction shift reward
        4.2.2 Inverse Perplexity reward
        4.2.3 Relevance reward
      4.3 Aspect Generation
      4.4 Training Pipeline
    5 Experiments
      5.1 Experiment Setup
      5.2 RQ1: Attack Performance
      5.3 RQ2: Quality of Attack Reviews
      5.4 RQ3: Human Evaluation and Detector
      5.5 RQ4: Adversarial Training
      5.6 Ablation Study
    6 Ethical Considerations
    7 Conclusion
    References
    Appendices
    A My Appendix
      A.1 Results of ABAE
      A.2 Adversarial Training Results on Yelp Dataset
      A.3 ChatGPT as Fake Review Classifier
      A.4 Examples of Different Attack Reviews
      A.5 Alternative Metrics for Evaluating Review Quality

