Nasal Event Detection Using Support Vector Machine | National Tsing Hua University Electronic Theses and Dissertations


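Graduate student:	Tsai, Ming-Chia (蔡明嘉)
Thesis title:	Nasal event detection using support vector machine (使用支持向量機演算法之鼻音事件偵測)
Advisor:	Wang, Hsiao-Chuan (王小川)
Oral defense committee:
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication:	2010
Academic year of graduation:	98 (ROC calendar)
Language:	Chinese
Number of pages:	44
Chinese keywords:	鼻音偵測、聲學特徵參數、支持向量機
English keywords:	Nasal detection, Acoustic parameter, Support vector machine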

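　　Next-generation automatic speech recognition systems adopt knowledge-based feature parameters, which give particular classes of sounds more representative features and thereby improve detection accuracy. This thesis examines the characteristics of nasals and semivowels, which are easily confused. The wavelet transform is used to compute the energy in each frequency band, and feature parameters are extracted by exploiting the fact that nasals and semivowels are dominated by low-frequency components. The features comprise Mel-frequency cepstral coefficients (MFCC), energy ratios, and the variation of the Hilbert envelope. After comparing how well these features separate the classes, a support vector machine (SVM) is used for classification. Once the frames have been classified, the release and closure transition points of nasals can be located, which gives the speech segment boundaries, and the accuracy of this method is examined. The experiments use the TIMIT corpus. Nasal detection accuracy reaches 82%, 2 percentage points above the 80% obtained by a keyword-spotting framework based on HMM phoneme recognition, whose features are MFCC + ΔMFCC + ΔΔMFCC + logEnergy + ΔlogEnergy + ΔΔlogEnergy. For the release and closure transition points detected by the proposed method, the mean errors relative to the hand-labeled marks are 9.74 ms and -8.9 ms, respectively. The false alarm rates for vowels, semivowels, fricatives, affricates, and stops are 2.4%, 1%, 2%, 1%, and 0.2%, respectively, which indicates that few misclassifications occur.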
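The abstract's last processing step, locating nasal transition points from per-frame classifier output and comparing them with hand labels, can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the 0/1 label encoding (1 = nasal frame), the 10 ms frame shift, and the toy data are all assumptions made here for illustration. It also assumes the usual phonetic convention that the closure marks the start of the nasal murmur and the release marks its end.

```python
from typing import List, Sequence, Tuple


def nasal_transitions(
    labels: Sequence[int], frame_shift_ms: float = 10.0
) -> Tuple[List[float], List[float]]:
    """Return (start_times_ms, end_times_ms) of each run of nasal frames.

    labels: per-frame classifier decisions, 1 = nasal, 0 = non-nasal
            (hypothetical encoding).
    frame_shift_ms: assumed spacing between consecutive frames.
    """
    starts: List[float] = []
    ends: List[float] = []
    prev = 0
    for i, lab in enumerate(labels):
        if lab == 1 and prev == 0:
            starts.append(i * frame_shift_ms)  # non-nasal -> nasal
        elif lab == 0 and prev == 1:
            ends.append(i * frame_shift_ms)    # nasal -> non-nasal
        prev = lab
    if prev == 1:
        # A nasal run that reaches the last frame ends with the signal.
        ends.append(len(labels) * frame_shift_ms)
    return starts, ends


def mean_signed_error(detected: Sequence[float], reference: Sequence[float]) -> float:
    """Mean of (detected - reference); positive means detections are late."""
    if len(detected) != len(reference) or len(detected) == 0:
        raise ValueError("detected and reference must be non-empty and paired")
    return sum(d - r for d, r in zip(detected, reference)) / len(detected)


# Toy example: two nasal segments in a ten-frame utterance.
frame_labels = [0, 0, 1, 1, 1, 0, 0, 1, 1, 0]
starts, ends = nasal_transitions(frame_labels)

# Made-up hand labels (ms) for the same two segments.
start_error = mean_signed_error(starts, [15.0, 65.0])
end_error = mean_signed_error(ends, [55.0, 95.0])
```

A signed mean, rather than an absolute one, is used because the abstract reports one positive and one negative average error, which suggests the sign of the offset is kept.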

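Chapter 1　Introduction
1 Research motivation
2 Background knowledge
3 Thesis organization
Chapter 2　Feature parameter extraction
1 Wavelet transform
2 Mel-frequency cepstral coefficients
3 Energy ratio and difference
4 Measurement of variation in the Hilbert envelope
Chapter 3　Support vector machine algorithm
1 Introduction
2 Primal form
3 Lagrange duality
4 Dual form
5 Kernel mapping
6 Non-separable case
Chapter 4　Experimental results and discussion
1 Experimental tools and corpus
1.1 Experimental tools
1.2 Experimental corpus
2 Performance evaluation
3 Experimental method
4 Experimental results
Chapter 5　Conclusion
References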

                                


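Full-text release date: the full text is not authorized for public release (campus network)
Full-text release date: the full text is not authorized for public release (off-campus network)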
