Graduate student: | 芭南米 Balani, Namrita |
---|---|
Thesis title: | 基於語言描述暗示的仇恨性言論辨識 Descriptive Linguistic Cues for Hate Speech Identification |
Advisor: | 陳宜欣 Chen, Yi-Shin |
Oral defense committee: | 陳朝欽 Chen, Chaur-Chin; 韓永楷 Hon, Wing-Kai |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications |
Year of publication: | 2020 |
Academic year of graduation: | 108 |
Language: | English |
Number of pages: | 39 |
Keywords (Chinese): | 仇恨言論 (hate speech) |
Keywords (foreign language): | linguistic cues |
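As technology advances worldwide, communication through online social media has grown rapidly. The hate speech that comes with it, however, has gradually become an issue that cannot be ignored. In this study, we examine offensive language across different corpora, phrases, and slang in order to find descriptive linguistic patterns that identify hate speech on Twitter. Studying hate speech is not easy: hate speech can change with time and season, whereas the descriptive patterns used with it gradually stabilize. Our results show that although hate speech changes with language and word choice, its underlying core concepts do not. In light of this, we propose a new framework that predicts potential hate speech from the linguistic patterns in the content of Twitter users' posts.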
Amidst the rise of technology, communication via social media platforms has grown exponentially. With more users working and communicating solely online, ensuring that hate speech is flagged properly should be of the greatest importance. In this thesis, we attempt to identify descriptive linguistic patterns that can detect the presence of hate speech in a tweet. Research in the hate speech domain currently focuses on identifying hate speech dictionaries, code words, slang, and offensive language associated with hate. However, keeping up with new words used to portray hate is a daunting task. Hate speech words can be rather seasonal and change over time, but the descriptive patterns commonly used with hate speech are more resilient. We present an approach showing that although hate words change over time, the intensity patterns tend to remain the same. We propose a framework that predicts these cases by identifying linguistic patterns, in user-generated content on Twitter, that lead to hate speech.