| Graduate student: | 唐榆翔 Tang, Yu-Hsiang |
|---|---|
| Thesis title: | 以文字探勘分析網路霸凌之現象-以厭女風氣為例 Analysis of Misogynic Images by Using Text Mining Technology – A Cyberbullying Case Study |
| Advisors: | 區國良 Ou, Kuo-Liang; 唐文華 Tarng, Wern-Huar |
| Oral defense committee: | 蘇俊銘 Su, Jun-Ming; 陳鏗任 Chen, Ken-Zen |
| Degree: | Master |
| Department: | 竹師教育學院 - 學習科學與科技研究所 Institute of Learning Sciences and Technologies |
| Year of publication: | 2023 |
| Academic year of graduation: | 111 |
| Language: | Chinese |
| Number of pages: | 101 |
| Keywords (Chinese): | 網路霸凌, 機器學習, 文字探勘, 自然語言處理, 女性貶抑 |
| Keywords (English): | Cyberbullying, Machine Learning, Text Mining, Natural Language Processing, Misogyny |
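Because the Internet allows anonymity, bullying increasingly uses it as a medium and spreads maliciously online. Because the effects of such bullying bring many changes to society, many studies have begun to examine the causes and characteristics of cyberbullying, so that readers and website operators can filter out inappropriate bullying posts early.
In the past, cyberbullying speech mostly revolved around religion and politics. With the rise of feminism in recent years, cyberbullying speech expressing hatred of women has also begun to appear frequently online. On the PTT forum, terms such as 母豬教 ("sow religion") appeared in 2015, treating hatred of women as a creed and attacking women indiscriminately online. This thesis therefore aims to understand the characteristics and causes of the misogynistic climate, the form of cyberbullying centered on hatred of women. Unlike traditional qualitative research methods, this thesis uses text mining: a web crawler automatically reads a large corpus from the Taiwanese forum PTT so that its features can be analyzed in a short time, and algorithms such as clustering identify the topics in the corpus. Sequence analysis is then used to look for relationships between misogynistic online opinion and cyberbullying across different time periods. Finally, machine learning algorithms are compared under different parameter settings to improve how well they classify the text, and word clouds visualize comments from different time periods, in order to compare the causes and characteristics of the misogynistic climate across different PTT boards.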
Because the Internet allows anonymity, more and more bullying is spreading maliciously with the Internet as its medium. Because its effects have a negative impact on society, a growing number of studies have begun to examine the causes and characteristics of cyberbullying, so that readers and website operators can filter out inappropriate bullying posts early.
In the past, most cyberbullying remarks revolved around religion and politics. With the rise of feminism in recent years, cyberbullying comments expressing misogyny have also begun to appear frequently on the Internet. In 2015, the term "sow religion" appeared on the PTT forum; it treated hatred of women as a belief, and its followers attacked women indiscriminately online. This thesis therefore aims to understand the characteristics and causes of misogyny as a form of cyberbullying.
In this thesis, text mining is used to analyze features of a large corpus read automatically from PTT, a Taiwanese Internet forum, and algorithms such as clustering are used to identify the topics in the corpus. Sequence analysis is then used to find relationships between misogynistic online opinion and cyberbullying in different time periods. Finally, machine learning algorithms are compared under different parameter settings to improve how well they classify the text. Word clouds visualize the comments from different time periods, in order to compare the causes and characteristics of misogyny across different PTT boards.
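The per-period word-cloud step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes the comments have already been crawled and segmented into tokens, and the sample rows, the function names `term_frequencies_by_period` and `top_terms`, and the choice of two top terms are all hypothetical. It only computes the per-period term counts that a word-cloud renderer would take as input.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-made sample: (period, tokens) pairs standing in for
# crawled and already-segmented forum comments. Not data from the thesis.
comments = [
    ("2015", ["insult", "women", "insult"]),
    ("2015", ["women", "joke"]),
    ("2016", ["news", "women", "debate"]),
    ("2016", ["debate", "insult", "debate"]),
]


def term_frequencies_by_period(rows):
    """Count how often each token appears within each time period."""
    counts = defaultdict(Counter)
    for period, tokens in rows:
        counts[period].update(tokens)
    return dict(counts)


def top_terms(counter, k=2):
    """Return the k most frequent terms, breaking ties alphabetically."""
    return sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


freqs = term_frequencies_by_period(comments)
for period in sorted(freqs):
    # Each line lists the terms that would be drawn largest in that period's cloud.
    print(period, top_terms(freqs[period]))
```

Comparing the top terms of consecutive periods gives a rough view of how the vocabulary shifts over time. A real pipeline would also need a Chinese word segmenter and a stop-word list, neither of which is shown here.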