Student: 黃志翔 Huang, Chih-Hsiang
Thesis title: 基於心理聲學之聲音驗證碼及其背後之攻防 (Attacking and Defending behind a Psychoacoustics-based Audio CAPTCHA)
Advisor: 劉奕汶 Liu, Yi-Wen
Committee members: 陳宜欣 Chen, Yi-Shin; 吳尚鴻 Wu, Shan-Hung; 冀泰石 Chi, Tai-Shih
Degree: Master (碩士)
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science (電機資訊學院 電機工程學系)
Year of publication: 2020
Graduation academic year: 109 (ROC calendar)
Language: English
Number of pages: 54
Keywords (Chinese): 聽覺驗證碼, 心理聲學, 聲音浮水印, 使用者介面, 聲音事件偵測, 後門
Keywords (English): audio CAPTCHA, psychoacoustics, watermarks, user interface, sound event detection, backdoor
A CAPTCHA is a technique for determining whether a user is a real human or an automated program. As technology advances and automated scripts become widely available, more and more malicious users try to exploit these convenient tools to reach a site's main pages and gain improper access. Although most website administrators are aware of this problem and actively deploy CAPTCHAs to defend against such automated attacks, the CAPTCHAs on existing websites are mostly visual or text-based, which is very unfriendly to users who are vision-impaired or blind. Although industry (e.g., Google) and academia have devoted effort to designing audio CAPTCHAs, the existing ones still leave much room for improvement in terms of both security and usability. Among previously proposed audio CAPTCHA schemes, some researchers focused on user interfaces convenient for blind users, some developed different types of audio CAPTCHAs, and others analyzed attacks against the existing reCAPTCHA, but few studies address all three aspects comprehensively. This thesis consolidates that literature and proposes a novel audio CAPTCHA, devises defenses against the likely attack paths, and quantitatively analyzes the CAPTCHA's security through alternating rounds of attack and defense. Whereas earlier CAPTCHAs often sacrificed user experience for the sake of security, for example through excessive image/text distortion or noise interference, this thesis uses psychoacoustics to minimize the disturbance while maximizing security. For the user-interface design, this thesis follows design guidelines from prior work and builds a simple interface in Python for listening-test participants to use and give feedback on.
A CAPTCHA is used to distinguish whether a user is a genuine human or an automated script. With the convenience of new technology and readily available tooling, more and more users abuse such scripts to gain inappropriate access. Although most web administrators are aware of this issue and adopt CAPTCHAs against automated attacks, most existing websites deploy only visual or textual CAPTCHAs, which are unfriendly to users who are vision-impaired or blind. Although large enterprises such as Google and the academic community have put effort into designing audio CAPTCHAs, there is still a long way to go in terms of security and user experience. Among previously proposed audio CAPTCHA schemes, some contributed user-friendly interfaces, some developed different kinds of audio CAPTCHAs, and others analyzed attacks against reCAPTCHA; however, little research discusses all of these issues comprehensively. This thesis proposes a new audio CAPTCHA built on prior work and simulates possible attack paths together with corresponding defenses. Through quantitative analysis over alternating rounds of attack and defense, we can assess the security of our audio CAPTCHA concretely. Previous CAPTCHAs often sacrificed user experience to enhance security, for example through heavily distorted images/text or noise interference. In contrast, the proposed method uses psychoacoustics to maximize security while minimizing the disturbance inside the CAPTCHA. Lastly, we build a simple but usable user interface, informed by prior work, for our listening test and gather feedback from the participants.
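To make the psychoacoustic idea more concrete, below is a minimal, illustrative sketch of how a masking threshold can bound the level of an added probe so that it remains inaudible to listeners. It is not the model used in the thesis: the Traunmüller Bark mapping, the Schroeder spreading function, the fixed 14 dB tonal-masker offset, and the example frequencies and levels are simplifying assumptions chosen for brevity.

```python
# Illustrative sketch only: a toy simultaneous-masking check, NOT the thesis's
# actual psychoacoustic model. It assumes a single tonal masker and the classic
# Schroeder spreading function on the Bark scale.
import numpy as np


def hz_to_bark(f):
    """Traunmueller's approximation of the Bark (critical-band) scale."""
    return 26.81 * f / (1960.0 + f) - 0.53


def spreading_db(dz):
    """Schroeder spreading function (dB) as a function of Bark distance dz."""
    return 15.81 + 7.5 * (dz + 0.474) - 17.5 * np.sqrt(1.0 + (dz + 0.474) ** 2)


def masking_threshold_db(masker_freq, masker_level_db, probe_freqs):
    """Rough masked threshold (dB SPL) produced by one tonal masker."""
    dz = hz_to_bark(np.asarray(probe_freqs, float)) - hz_to_bark(masker_freq)
    # A fixed ~14 dB downward offset for tonal maskers is a common textbook
    # simplification; real coders use level- and band-dependent offsets.
    return masker_level_db + spreading_db(dz) - 14.0


if __name__ == "__main__":
    # A 1 kHz masker at 70 dB SPL; how loud may a hidden 1.2 kHz probe be?
    probe_f = 1200.0
    thr = masking_threshold_db(1000.0, 70.0, [probe_f])[0]
    print(f"Probe at {probe_f:.0f} Hz should stay below ~{thr:.1f} dB SPL to remain masked.")
```

Any component kept below such a threshold is expected to be imperceptible to human listeners while still being present in the signal, which is the general mechanism a psychoacoustics-based CAPTCHA can exploit.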
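The abstract only states that the listening-test interface was written in Python; the toolkit and layout are not specified here. The following is a hypothetical minimal sketch using Tkinter, in which the class name, widget layout, and the play_clip/on_answer callbacks are invented for illustration.

```python
# Hypothetical sketch of a minimal listening-test window in Python/Tkinter.
# The widget layout, button labels, and answer handling below are illustrative
# assumptions, not the interface described in the thesis.
import tkinter as tk


class CaptchaTrial(tk.Tk):
    def __init__(self, play_clip, on_answer):
        super().__init__()
        self.title("Audio CAPTCHA listening test")
        self.play_clip = play_clip      # callback that plays the stimulus
        self.on_answer = on_answer      # callback that records the response
        tk.Button(self, text="Play", command=self.play_clip).pack(padx=20, pady=5)
        self.entry = tk.Entry(self)     # participant types what they heard
        self.entry.pack(padx=20, pady=5)
        tk.Button(self, text="Submit", command=self._submit).pack(padx=20, pady=5)

    def _submit(self):
        self.on_answer(self.entry.get())
        self.entry.delete(0, tk.END)    # clear the field for the next trial


if __name__ == "__main__":
    # Stub callbacks so the sketch runs without any audio backend.
    app = CaptchaTrial(play_clip=lambda: print("(play stimulus)"),
                       on_answer=lambda ans: print("answer:", ans))
    app.mainloop()
```

In an actual test, play_clip would play the CAPTCHA stimulus and on_answer would log the participant's response for later analysis.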