
Graduate student: 左之望 (Tso, Chih-Wang)
Thesis title: 運用卷積神經網路建構性騷擾語音辨識系統
Sexual Harassment Speech Recognition System using Convolutional Neural Networks
Advisor: 蘇朝墩 (Su, Chao-Ton)
Oral defense committee: 薛友仁 (Shiue, Yeou-Ren); 許俊欽 (Hsu, Chun-Chin); 蕭宇翔 (Hsiao, Yu-Hsiang)
Degree: Master
Department: College of Engineering - Department of Industrial Engineering and Engineering Management
Year of publication: 2022
Academic year of graduation: 110 (ROC calendar, 2021-2022)
Language: Chinese
Number of pages: 56
Keywords (Chinese): 性騷擾, 語音辨識, RAVDESS, 特徵提取, 資料擴增, 卷積神經網路
Keywords (English): sexual harassment, speech recognition, RAVDESS, feature extraction, data augmentation, convolutional neural network (CNN)
  • In recent years, the prevention of sexual harassment and sexual assault has drawn growing public attention, and harassment in public places accounts for the largest share of all sexual harassment cases. Only a few studies address prevention at the moment harassment occurs, and none has used speech recognition to detect sexual harassment. This study aims to help victims recognize their situation with more certainty and respond immediately when harassment occurs. We use the RAVDESS speech dataset and recruited four gender equality committee members with domain knowledge of sexual harassment to label the audio: each expert listened to the recordings and scored each one according to whether it raised a concern of sexual harassment. After extracting speech features from the labeled recordings, we applied SpecAugment data augmentation to balance the class sample sizes and trained convolutional neural network (CNN) models. The two-dimensional CNN reached a recognition accuracy of 90.32%, which indicates that the model can effectively determine whether a speech recording constitutes sexual harassment.
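    The augmentation step mentioned in the abstract (using SpecAugment to balance the data) can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes a spectrogram stored as a plain list of rows (frequency bins by time frames), applies only the frequency-masking and time-masking parts of SpecAugment (time warping is omitted), and uses illustrative mask widths. The function names `mask_band`, `spec_augment`, and `balance_with_augmentation` are hypothetical.

```python
import random


def mask_band(spec, axis, max_width, rng, fill=0.0):
    """Return a copy of spec with one random band set to `fill`.

    spec is a list of rows (frequency bins), each a list of time-frame values.
    axis=0 masks consecutive frequency bins; axis=1 masks consecutive frames.
    """
    n_freq, n_time = len(spec), len(spec[0])
    size = n_freq if axis == 0 else n_time
    width = rng.randint(0, min(max_width, size))  # band width, may be 0
    start = rng.randint(0, size - width)          # first masked index
    out = [row[:] for row in spec]                # leave the input untouched
    for f in range(n_freq):
        for t in range(n_time):
            idx = f if axis == 0 else t
            if start <= idx < start + width:
                out[f][t] = fill
    return out


def spec_augment(spec, rng, max_freq_width=2, max_time_width=3):
    """Apply one frequency mask, then one time mask (widths are assumptions)."""
    masked = mask_band(spec, 0, max_freq_width, rng)
    return mask_band(masked, 1, max_time_width, rng)


def balance_with_augmentation(majority, minority, rng):
    """Add masked copies of minority-class spectrograms until sizes match."""
    if not minority:
        raise ValueError("minority class must contain at least one sample")
    augmented = list(minority)
    while len(augmented) < len(majority):
        augmented.append(spec_augment(rng.choice(minority), rng))
    return augmented


# Toy usage: 4 mel bins x 6 frames, five majority samples, two minority samples.
rng = random.Random(0)
toy = [[1.0] * 6 for _ in range(4)]
balanced_minority = balance_with_augmentation([toy] * 5, [toy, toy], rng)
print(len(balanced_minority))
```

    Plain lists keep the sketch free of dependencies. A real pipeline would more likely operate on array-based mel spectrograms and use the augmentation parameters reported in the thesis.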

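    Table of Contents
    Chapter 1. Introduction
      1.1 Research Background and Motivation
      1.2 Research Objectives
      1.3 Research Limitations
      1.4 Research Framework
    Chapter 2. Literature Review
      2.1 Sexual Harassment
        2.1.1 Origin and Definition
        2.1.2 Harm Caused
        2.1.3 Reporting Status
      2.2 Sexual Harassment Prevention
        2.2.1 Traditional Approaches
        2.2.2 Technology-Assisted Approaches
      2.3 Speech Recognition and Data Augmentation
        2.3.1 Speech Recognition Applications
        2.3.2 Speech Feature Extraction
        2.3.3 Data Augmentation
      2.4 Convolutional Neural Networks
    Chapter 3. Methodology
      3.1 Data Collection and Classification
      3.2 Audio Feature Extraction
      3.3 Data Augmentation
      3.4 Model Architecture
      3.5 Model Training and Evaluation
    Chapter 4. Case Study
      4.1 Problem Description
      4.2 Data Collection and Classification
      4.3 Feature Extraction
      4.4 Data Augmentation
      4.5 Model Training
        4.5.1 Model Architecture
        4.5.2 Model Training
        4.5.3 Model Results
      4.6 Discussion
    Chapter 5. Conclusion and Future Work
      5.1 Conclusion
      5.2 Suggestions for Future Work

    List of Figures
      Figure 1. Schematic of the mel spectrogram during training
      Figure 2. The three SpecAugment deformations applied to a spectrogram
      Figure 3. Example of the convolution operation
      Figure 4. Example of max-pooling
      Figure 5. Example of the flatten process
      Figure 6. Example of a fully connected layer
      Figure 7. Mel spectrogram of the sample data
      Figure 8. Mel spectrogram after data augmentation
      Figure 9. Model structure of the two-dimensional CNN
      Figure 10. Model structure of the one-dimensional CNN
      Figure 11. Confusion matrix of the two-dimensional CNN
      Figure 12. Confusion matrix of the one-dimensional CNN

    List of Tables
      Table 1. Confusion matrix calculation
      Table 2. Software, hardware, and packages
      Table 3. Profiles of the professionals in fields related to sexual harassment prevention
      Table 4. Parameter settings for feature extraction
      Table 5. Parameter settings of the LD data augmentation policy
      Table 6. Example input features of the classified audio data
      Table 7. Parameter settings for model training
      Table 8. Precision, recall, F1-score, and accuracy of the two-dimensional CNN
      Table 9. Precision, recall, F1-score, and accuracy of the one-dimensional CNN
      Table 10. Comparison of the one-dimensional and two-dimensional CNNs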

