Student: 黃志翔 Huang, Chih-Hsiang
Thesis title: 基於心理聲學之聲音驗證碼及其背後之攻防 (Attacking and Defending behind a Psychoacoustics-based Audio CAPTCHA)
Advisor: 劉奕汶 Liu, Yi-Wen
Committee members: 陳宜欣 Chen, Yi-Shin; 吳尚鴻 Wu, Shan-Hung; 冀泰石 Chi, Tai-Shih
Degree: Master (碩士)
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science (電機資訊學院 電機工程學系)
Year of publication: 2020
Graduation academic year: 109 (ROC calendar)
Language: English
Number of pages: 54
Keywords (Chinese): 聽覺驗證碼, 心理聲學, 聲音浮水印, 使用者介面, 聲音事件偵測, 後門
Keywords (English): audio CAPTCHA, psychoacoustics, watermarks, user interface, sound event detection, backdoor
A CAPTCHA is a technique for determining whether a user is a real human or an automated program. As technology advances and automated scripts become widely available, more and more malicious users try to exploit these convenient tools to reach a site's main pages and gain improper access. Although most website administrators are aware of this problem and actively deploy CAPTCHAs to defend against such automated attacks, the CAPTCHAs on existing websites are mostly visual or text-based, which is very unfriendly to users who are vision-impaired or blind. Although industry (e.g., Google) and academia have devoted effort to designing audio CAPTCHAs, the existing ones still leave much room for improvement in terms of both security and usability. Among previously proposed audio CAPTCHA schemes, some researchers focused on user interfaces convenient for blind users, some developed different types of audio CAPTCHAs, and others analyzed attacks against the existing reCAPTCHA, but few studies address all three aspects comprehensively. This thesis consolidates that literature and proposes a novel audio CAPTCHA, devises defenses against the likely attack paths, and quantitatively analyzes the CAPTCHA's security through alternating rounds of attack and defense. Whereas earlier CAPTCHAs often sacrificed user experience for the sake of security, for example through excessive image/text distortion or noise interference, this thesis uses psychoacoustics to minimize the disturbance while maximizing security. For the user-interface design, this thesis follows design guidelines from prior work and builds a simple interface in Python for listening-test participants to use and give feedback on.
A CAPTCHA is used to distinguish whether a user is a genuine human or an automated script. With the convenience of new technology and readily available tooling, more and more users abuse such scripts to gain inappropriate access. Although most web administrators are aware of this issue and adopt CAPTCHAs against automated attacks, most existing websites deploy only visual or textual CAPTCHAs, which are unfriendly to users who are vision-impaired or blind. Although large enterprises such as Google and the academic community have put effort into designing audio CAPTCHAs, there is still a long way to go in terms of security and user experience. Among previously proposed audio CAPTCHA schemes, some contributed user-friendly interfaces, some developed different kinds of audio CAPTCHAs, and others analyzed attacks against reCAPTCHA; however, little research discusses all of these issues comprehensively. This thesis proposes a new audio CAPTCHA built on prior work and simulates possible attack paths together with corresponding defenses. Through quantitative analysis over alternating rounds of attack and defense, we can assess the security of our audio CAPTCHA concretely. Previous CAPTCHAs often sacrificed user experience to enhance security, for example through heavily distorted images/text or noise interference. In contrast, the proposed method uses psychoacoustics to maximize security while minimizing the disturbance inside the CAPTCHA. Lastly, we build a simple but usable user interface, informed by prior work, for our listening test and gather feedback from the participants.
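To make the psychoacoustic idea more concrete, below is a minimal, illustrative sketch of how a masking threshold can bound the level of an added probe so that it remains inaudible to listeners. It is not the model used in the thesis: the Traunmüller Bark mapping, the Schroeder spreading function, the fixed 14 dB tonal-masker offset, and the example frequencies and levels are simplifying assumptions chosen for brevity.

```python
# Illustrative sketch only: a toy simultaneous-masking check, NOT the thesis's
# actual psychoacoustic model. It assumes a single tonal masker and the classic
# Schroeder spreading function on the Bark scale.
import numpy as np


def hz_to_bark(f):
    """Traunmueller's approximation of the Bark (critical-band) scale."""
    return 26.81 * f / (1960.0 + f) - 0.53


def spreading_db(dz):
    """Schroeder spreading function (dB) as a function of Bark distance dz."""
    return 15.81 + 7.5 * (dz + 0.474) - 17.5 * np.sqrt(1.0 + (dz + 0.474) ** 2)


def masking_threshold_db(masker_freq, masker_level_db, probe_freqs):
    """Rough masked threshold (dB SPL) produced by one tonal masker."""
    dz = hz_to_bark(np.asarray(probe_freqs, float)) - hz_to_bark(masker_freq)
    # A fixed ~14 dB downward offset for tonal maskers is a common textbook
    # simplification; real coders use level- and band-dependent offsets.
    return masker_level_db + spreading_db(dz) - 14.0


if __name__ == "__main__":
    # A 1 kHz masker at 70 dB SPL; how loud may a hidden 1.2 kHz probe be?
    probe_f = 1200.0
    thr = masking_threshold_db(1000.0, 70.0, [probe_f])[0]
    print(f"Probe at {probe_f:.0f} Hz should stay below ~{thr:.1f} dB SPL to remain masked.")
```

Any component kept below such a threshold is expected to be imperceptible to human listeners while still being present in the signal, which is the general mechanism a psychoacoustics-based CAPTCHA can exploit.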
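The abstract only states that the listening-test interface was written in Python; the toolkit and layout are not specified here. The following is a hypothetical minimal sketch using Tkinter, in which the class name, widget layout, and the play_clip/on_answer callbacks are invented for illustration.

```python
# Hypothetical sketch of a minimal listening-test window in Python/Tkinter.
# The widget layout, button labels, and answer handling below are illustrative
# assumptions, not the interface described in the thesis.
import tkinter as tk


class CaptchaTrial(tk.Tk):
    def __init__(self, play_clip, on_answer):
        super().__init__()
        self.title("Audio CAPTCHA listening test")
        self.play_clip = play_clip      # callback that plays the stimulus
        self.on_answer = on_answer      # callback that records the response
        tk.Button(self, text="Play", command=self.play_clip).pack(padx=20, pady=5)
        self.entry = tk.Entry(self)     # participant types what they heard
        self.entry.pack(padx=20, pady=5)
        tk.Button(self, text="Submit", command=self._submit).pack(padx=20, pady=5)

    def _submit(self):
        self.on_answer(self.entry.get())
        self.entry.delete(0, tk.END)    # clear the field for the next trial


if __name__ == "__main__":
    # Stub callbacks so the sketch runs without any audio backend.
    app = CaptchaTrial(play_clip=lambda: print("(play stimulus)"),
                       on_answer=lambda ans: print("answer:", ans))
    app.mainloop()
```

In an actual test, play_clip would play the CAPTCHA stimulus and on_answer would log the participant's response for later analysis.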