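Graduate student: | 吳乙彤 Wu, Yi-Tung |
---|---|
Thesis title: | 基於語音特徵判斷語句內容真實性 Determining the authenticity of speech content via analyzing the voice characteristics |
Advisor: | 劉奕汶 Liu, Yi-Wen |
Oral defense committee: | 白明憲 Bai, Ming-Sian; 李祈均 Lee, Chi-Chun |
Degree: | 碩士 Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2018 |
Academic year of graduation: | 107 (ROC calendar) |
Language: | Chinese |
Number of pages: | 52 |
Chinese keywords: | 測謊 (lie detection), 決策樹 (decision tree), 語音特徵 (speech features) |
Foreign-language keywords: | deception, speech, decision tree |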
Scams are pervasive in today's society, and most are carried out by telephone. If the speech in a scam call could be analyzed, so that the voice characteristics of fabricated content indicate whether the stated purpose of the call is genuine, phone scams might be prevented. Deception-related research has been receiving increasing attention. In this study, we recruited volunteer subjects from among the students of National Tsing Hua University and designed a questionnaire and procedure to collect, in the form of a game, speech in which the subjects lie and tell the truth; the questionnaire allows the ground truth to be labeled for the entire database. From these recordings we built a speech database for lie detection. We analyze the recorded answers with digital signal processing methods to extract speech features, and then train decision tree models that aim to determine automatically, from specific speech features, whether an unseen utterance is truthful. Because the important features differ from person to person, we also construct personalized models and a general model based on per-subject feature importance, so that the model can accommodate these individual differences and achieve better performance and higher generalization ability. We further use this information to examine whether subjects' behavior tends to cluster when they lie. In addition to reporting the machine learning results and comparing recognition rates before and after feature selection, we propose directions for future work based on the current results: adding more features, such as laughter detection and voice brightness (tone), and finding a better way to implement feature weighting to improve the performance of the general model.
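To make the decision tree step concrete, the following is a minimal sketch of how a single tree node chooses a split and how the same quantity can rank features by importance. It is not the thesis's implementation. The two features (`mean pitch` and an uninformative noise value), the synthetic numbers, the direction of the pitch difference, and the helper names `gini`, `best_split`, and `predict` are all assumptions made for illustration, and only the Python standard library is used.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0 = truth, 1 = lie)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels):
    """Return (feature index, threshold, impurity decrease) of the best single split."""
    n = len(rows)
    parent = gini(labels)
    best = (None, None, 0.0)
    for f in range(len(rows[0])):
        values = sorted(set(row[f] for row in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            if parent - child > best[2]:
                best = (f, t, parent - child)
    return best


# Synthetic data only (hypothetical): column 0 mimics a mean pitch in Hz that
# differs between classes, column 1 is noise unrelated to the label.
rng = random.Random(0)
rows, labels = [], []
for _ in range(40):
    rows.append([rng.gauss(180.0, 10.0), rng.gauss(0.0, 1.0)])
    labels.append(0)
    rows.append([rng.gauss(220.0, 10.0), rng.gauss(0.0, 1.0)])
    labels.append(1)

feature, threshold, gain = best_split(rows, labels)

# A depth-1 tree predicts the majority label on each side of the threshold.
left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
left_label = int(2 * sum(left) > len(left))
right_label = int(2 * sum(right) > len(right))


def predict(row):
    """Classify one feature vector with the learned depth-1 tree."""
    return left_label if row[feature] <= threshold else right_label


accuracy = sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)
print(f"split on feature {feature} at {threshold:.1f}, "
      f"impurity decrease {gain:.3f}, training accuracy {accuracy:.2f}")
```

The impurity decrease of the chosen split is the quantity that tree libraries typically accumulate per feature to report feature importance, which is one plausible basis for the per-subject importance described in the abstract. The accuracy printed here is measured on the training data and says nothing about generalization.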