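| Graduate student: | 黃士旗 Shih Chih Huang |
|---|---|
| Thesis title: | 中文語音聲調辨識的改良與錯誤分析 Improvement and Error Analysis of Tone Recognition for Mandarin Chinese |
| Advisor: | 張智星 |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications |
| Year of publication: | 2006 |
| Academic year of graduation: | 94 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 62 |
| Chinese keywords: | 中文聲調辨識 (Mandarin tone recognition) |
| Foreign-language keywords: | Tone Recognition, Mandarin |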
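Tone recognition for Mandarin speech is an important topic in audio processing, and the factor that most directly determines a Mandarin tone is its pitch contour. This thesis begins by defining features of the pitch contour. Beyond the pitch contour itself, Mandarin tones also vary with regional pronunciation characteristics and tone sandhi rules, so we add features that relate each tone to the tones before and after it. We then use this set of feature parameters to model Mandarin tones.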
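After identifying the tone-related features, we apply two common classifiers to tone recognition. The first is based on Gaussian mixture models (GMMs), with a separate model trained for each tone. The second uses the support vector machine (SVM) algorithm to find a suitable set of hyperplanes that separate the tones. We also apply a feature selection method to reduce the data dimensionality and observe whether the tone recognition rate changes significantly.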
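The per-tone modeling idea can be sketched as follows. This is a minimal illustration rather than the thesis implementation: it fits a single diagonal-covariance Gaussian per tone (the one-component special case of a GMM) instead of a full mixture trained with EM, and the function names, the variance floor, and the two-dimensional toy features are all assumptions made for the example.

```python
import math
from collections import defaultdict


def fit_tone_models(features, tones, var_floor=1e-6):
    """Fit one diagonal-covariance Gaussian per tone label.

    Returns {tone: (log_prior, means, variances)}.
    """
    grouped = defaultdict(list)
    for x, tone in zip(features, tones):
        grouped[tone].append(x)
    total = len(features)
    models = {}
    for tone, rows in grouped.items():
        n = len(rows)
        dim = len(rows[0])
        means = [sum(r[d] for r in rows) / n for d in range(dim)]
        # Floor the variances so a constant feature cannot cause division by zero.
        variances = [
            max(sum((r[d] - means[d]) ** 2 for r in rows) / n, var_floor)
            for d in range(dim)
        ]
        models[tone] = (math.log(n / total), means, variances)
    return models


def log_score(x, model):
    """Log prior plus diagonal-Gaussian log-likelihood of feature vector x."""
    log_prior, means, variances = model
    total = log_prior
    for xi, m, v in zip(x, means, variances):
        total += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
    return total


def classify_tone(x, models):
    """Return the tone whose model gives x the highest score."""
    return max(models, key=lambda tone: log_score(x, models[tone]))


# Made-up 2-D features (hypothetically: pitch slope, normalized mean pitch).
train_x = [[0.0, 1.0], [0.1, 0.9], [-0.1, 1.1],   # tone 1: flat and high
           [1.0, 0.0], [0.9, 0.1], [1.1, -0.1]]   # tone 2: rising
train_y = [1, 1, 1, 2, 2, 2]
models = fit_tone_models(train_x, train_y)
predicted = classify_tone([0.05, 0.95], models)
```

Classification picks the tone model with the highest log score, which is the same decision rule a full per-tone GMM classifier would use; only the density estimate for each tone would be richer.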
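We run the experiments on two speech corpora: Corpus455 (a single male speaker) and the Tang poem corpus (TangPoem; multiple speakers, 2 female and 8 male). For Corpus455, the feature dimension is reduced from 30 to 8, and the recognition rates with GMM and SVM rise by 6.05% and 1.70%, respectively. For the Tang poem corpus, the feature dimension is reduced from 30 to 10; the recognition rate drops slightly, by 0.61%, with GMM, but rises markedly with SVM, from 62.05% to 72.49%, an improvement of 10.43%.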
Mandarin is a tonal language in which each syllable carries one of five tones. In general, the tone of a Mandarin syllable is characterized by its pitch contour, so this study adopts several acoustic features derived from pitch information. Because tones are also influenced by pronunciation differences and tone sandhi rules, we add inter-syllabic acoustic features as well.
Once these features are available, we apply two popular classifiers, the Gaussian mixture model (GMM) and the support vector machine (SVM), to tone recognition. We also use the sequential floating search method (SFSM) for feature selection.
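As an illustration of how a floating search differs from plain forward selection, here is a minimal sketch of the forward variant of sequential floating search. It is not the thesis code: the function name `sffs`, the `score` callback (which in practice would be a recognition rate measured for a feature subset), and the toy score table are assumptions made for the example.

```python
def sffs(n_features, score, target_size):
    """Sequential forward floating selection (sketch).

    score: callable taking a sorted tuple of feature indices; higher is better.
    target_size must not exceed n_features.
    Returns (subset, score) for the best subset of size target_size seen.
    """
    selected = []
    best = {}  # subset size -> (score, subset)

    def evaluate(subset):
        return score(tuple(sorted(subset)))

    def remember(subset):
        s = evaluate(subset)
        if len(subset) not in best or s > best[len(subset)][0]:
            best[len(subset)] = (s, tuple(sorted(subset)))

    while len(selected) < target_size:
        # Forward step: add the feature that gives the highest score.
        remaining = [f for f in range(n_features) if f not in selected]
        add = max(remaining, key=lambda f: evaluate(selected + [f]))
        selected = selected + [add]
        remember(selected)

        # Floating step: remove features while the reduced subset beats
        # the best subset of that size found so far.
        while len(selected) > 2:
            drop = max(selected,
                       key=lambda f: evaluate([g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            if evaluate(reduced) > best[len(reduced)][0]:
                selected = reduced
                remember(selected)
            else:
                break

    top_score, top_subset = best[target_size]
    return list(top_subset), top_score


# Made-up scores for every subset that the search below evaluates.
TOY_SCORES = {
    (0,): 5, (1,): 4, (2,): 3, (3,): 1,
    (0, 1): 7, (0, 2): 6, (0, 3): 6, (1, 2): 10, (1, 3): 5, (2, 3): 4,
    (0, 1, 2): 9, (0, 1, 3): 8, (0, 2, 3): 7, (1, 2, 3): 12,
}
subset, value = sffs(4, TOY_SCORES.__getitem__, 3)
```

In this toy table, plain forward selection would stop at features 0, 1, 2 with a score of 9. The floating step drops feature 0 once the pair (1, 2) proves better than any pair seen before, and the search then reaches features 1, 2, 3 with a score of 12.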
We conduct experiments on two datasets, Corpus455 and TangPoem. With SFSM, the feature dimensionality is reduced from 30 to 8 for Corpus455 and from 30 to 10 for TangPoem. On Corpus455, the tone recognition rates of GMM+SFSM and SVM+SFSM improve by about 6.05% and 1.70%, respectively, compared with GMM and SVM alone. On TangPoem, the corresponding changes are about -0.61% and +10.43%.