
Graduate student: 唐榆翔
Tang, Yu-Hsiang
Thesis title: 以文字探勘分析網路霸凌之現象-以厭女風氣為例
Analysis of Misogynic Images by Using Text Mining Technology – A Cyberbullying Case Study
Advisors: 區國良
Ou, Kuo-Liang
唐文華
Tarng, Wern-Huar
Oral defense committee: 蘇俊銘
Su, Jun-Ming
陳鏗任
Chen, Ken-Zen
Degree: Master's
Department: 竹師教育學院 - 學習科學與科技研究所
Institute of Learning Sciences and Technologies
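Year of publication: 2023
Academic year of graduation: 111 (ROC calendar)
Language: Chinese
Number of pages: 101
Chinese keywords: 網路霸凌, 機器學習, 文字探勘, 自然語言處理, 女性貶抑
English keywords: Cyberbullying, Machine Learning, Text Mining, Natural Language Processing, Misogyny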
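  • Because the Internet allows anonymity, bullying is increasingly carried out and maliciously spread online. Because its effects bring many changes to society, many studies have begun to examine the causes and characteristics of cyberbullying, so that readers and site operators can filter out inappropriate bullying posts early.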

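    Cyberbullying speech used to center mostly on religion and politics. With the rise of feminism in recent years, cyberbullying language expressing hatred of women has also begun to appear frequently online. In 2015, terms such as 母豬教 ("sow religion") even appeared on the PTT forum, treating hatred of women as a creed and attacking women indiscriminately online. This thesis therefore seeks to understand the characteristics and causes of the misogynistic climate, the form of cyberbullying centered on hatred of women. Unlike traditional qualitative research, this thesis uses text mining, which makes it possible in a short time to have a crawler automatically read a large corpus from the Taiwanese forum PTT for feature analysis and to identify topics in the corpus with algorithms such as clustering. Sequence analysis is then used to look for relationships between misogynistic online opinion and cyberbullying across time periods. Finally, machine learning algorithms are run under different parameter settings and compared to improve how well they discriminate text, supplemented by word clouds that visualize comments from different periods, in order to compare the causes and characteristics of misogyny across PTT boards.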


    Because of the anonymity of the Internet, bullying increasingly spreads maliciously online. Because its effects harm society, a growing number of studies examine the causes and characteristics of cyberbullying, so that readers and website operators can filter out inappropriate bullying posts early.

    In the past, most cyberbullying remarks revolved around religion and politics. With the rise of feminism in recent years, misogynistic cyberbullying comments have also begun to appear frequently online. In 2015, a term translated as "sow religion" emerged on the PTT forum; it treats hatred of women as a creed, and indiscriminate attacks on women were carried out online under it. This thesis therefore seeks to understand the characteristics and causes of misogyny as a form of cyberbullying.

    This thesis uses text mining: a crawler automatically reads a large corpus from the Taiwanese Internet forum PTT for feature analysis, and algorithms such as clustering identify the topics in the corpus. Sequence analysis is then used to find relationships between misogynistic online opinion and cyberbullying across time periods. Finally, machine learning algorithms are compared under different parameter settings to improve how well they discriminate text, and word clouds visualize comments from different time periods, in order to compare the causes and characteristics of misogyny across PTT boards.
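
The lexicon-and-period counting that would feed such word clouds might look roughly like the sketch below. It is an illustration only, not the thesis's implementation: the term list, board names, and posts are hypothetical placeholders, and the record does not say how the thesis tokenizes or counts text. Plain substring counting is assumed here because Chinese text carries no word delimiters.

```python
from collections import Counter, defaultdict

# Hypothetical placeholder lexicon; the thesis's actual seed words
# (its Appendix 1) are not reproduced in this record.
SEED_TERMS = ["seed_a", "seed_b"]

# Hypothetical posts as (board, "YYYY-MM", text); real data would come
# from a PTT crawler, which is not shown here.
POSTS = [
    ("BoardX", "2015-03", "seed_a appears here and seed_a again"),
    ("BoardX", "2015-04", "nothing relevant"),
    ("BoardY", "2015-03", "seed_b with seed_a"),
]


def term_counts_by_period(posts, terms):
    """Count substring occurrences of each lexicon term per (board, period)."""
    counts = defaultdict(Counter)
    for board, period, text in posts:
        for term in terms:
            hits = text.count(term)  # non-overlapping substring matches
            if hits:
                counts[(board, period)][term] += hits
    return counts


counts = term_counts_by_period(POSTS, SEED_TERMS)
for key in sorted(counts):
    print(key, dict(counts[key]))
```

The per-period counters could then be passed to any word-cloud renderer, or compared across boards directly.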

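    Table of Contents
    Abstract (Chinese)
    Abstract
    Contents
    List of Tables
    Chapter 1 Introduction
        Section 1 Research Background and Motivation
        Section 2 Research Objectives
    Chapter 2 Literature Review
        Section 1 Definition and Characteristics of Bullying and Cyberbullying
        Section 2 Text Mining
        Section 3 Misogyny
    Chapter 3 Research Methods
        Section 1 Research Framework
    Chapter 4 Results
        Section 1 Classification of the Dataset
        Section 2 Bullying Detection Experiment Results
        Section 3 Analysis of the Dataset
    Chapter 5 Conclusions and Recommendations
        Section 1 Conclusions
        Section 2 Limitations and Recommendations
    References
    Appendix 1 Seed Words of the Misogyny Lexicon
    Appendix 2 Misogynistic Vocabulary Expanded via Word Embeddings
    Appendix 3 Detailed Classification of Gender-Bullying Misogynistic Terms: Word Statistics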

