An Initial Study on Stress Detection for Spoken English | National Tsing Hua University Electronic Theses and Dissertations


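Graduate student:	Tseng, Ching-Yu (曾璟鈺)
Thesis title:	An Initial Study on Stress Detection for Spoken English (口說英語重音辨識之初步研究)
Advisor:	Jang, Jyh-Shing Roger (張智星)
Oral defense committee:
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication:	2009
Academic year of graduation:	97
Language:	Chinese
Number of pages:	61
Chinese keywords:	重音辨識, 兩步驟分類
English keywords:	Stress Detection, two-stage classification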

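This thesis studies stress detection for multisyllabic English words. Two detection methods are used: a single-feature (one-dimensional) method and a two-stage classification method. Both take as input the speech recording of a single multisyllabic word. Forced alignment segments the recording into consonant and vowel phonemes, and a pitch vector, a volume vector, and a duration are extracted for each phoneme. The information from each vowel phoneme is taken to represent one syllable. The final output marks exactly one vowel per word as stressed.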
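The first method extracts the pitch vector and the volume vector of each vowel phoneme (that is, each syllable) in a word and reduces each vector to a single number with one of several statistics: median, mean, maximum, maximum of the derivative, median of the derivative, first quartile, and third quartile. Each statistic therefore yields one feature value. A single feature value (for example, the median pitch of each vowel) is then used directly to detect the stressed syllable of the word.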
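The single-feature method can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: the abstract does not state the decision rule, so the sketch assumes the syllable with the largest feature value is taken as stressed. The derivative is approximated by frame-to-frame differences, and the function names and the toy pitch values are hypothetical.

```python
from statistics import mean, median, quantiles


def frame_differences(values):
    """Frame-to-frame differences, a discrete stand-in for the derivative."""
    return [b - a for a, b in zip(values, values[1:])]


# One reducer per statistic named in the abstract.
STATISTICS = {
    "median": median,
    "mean": mean,
    "max": max,
    "max_derivative": lambda v: max(frame_differences(v)),
    "median_derivative": lambda v: median(frame_differences(v)),
    "q1": lambda v: quantiles(v, n=4)[0],
    "q3": lambda v: quantiles(v, n=4)[2],
}


def detect_stress(vowel_vectors, statistic="median"):
    """Return the 0-based index of the vowel with the largest feature value.

    Assumption: "largest value wins" is an illustrative decision rule only.
    """
    reduce_fn = STATISTICS[statistic]
    scores = [reduce_fn(vector) for vector in vowel_vectors]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical pitch vectors (Hz) for the two vowels of one word.
pitch_vectors = [
    [210.0, 220.0, 225.0, 218.0],  # first vowel
    [180.0, 175.0, 170.0, 168.0],  # second vowel
]
stressed_index = detect_stress(pitch_vectors, "median")
print(stressed_index)
```

Each vector needs at least two frames for the derivative and quartile statistics to be defined.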
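The second method has two stages. The first stage uses a Gaussian mixture model (GMM) to classify the feature vector of each vowel phoneme (that is, each syllable) as stressed or unstressed. Because a word can have only one primary stressed syllable, the second stage treats an n-syllable word as an n-class classification problem that decides which syllable carries the stress; for example, two-syllable words fall into 2 classes, stress on the first syllable or stress on the second. The features for the second stage are the stressed and unstressed log likelihoods that the first stage produces for each word.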
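The two-stage idea can be sketched as below, under several assumptions that do not come from the thesis: a single diagonal Gaussian per class stands in for the GMM, a nearest-centroid rule stands in for the second-stage classifier (the abstract does not name one), and all function names and toy feature values are hypothetical.

```python
import math


def fit_gaussian(rows):
    """Fit a diagonal Gaussian (a one-component stand-in for a GMM)."""
    dim = len(rows[0])
    means = [sum(r[d] for r in rows) / len(rows) for d in range(dim)]
    variances = [
        max(sum((r[d] - means[d]) ** 2 for r in rows) / len(rows), 1e-6)
        for d in range(dim)
    ]
    return means, variances


def log_likelihood(x, model):
    """Log density of feature vector x under a diagonal Gaussian."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (xi - mu) ** 2 / var)
        for xi, mu, var in zip(x, means, variances)
    )


def fit_stage_one(words, labels):
    """Stage 1: one model for stressed syllables, one for unstressed."""
    stressed, unstressed = [], []
    for word, stress_index in zip(words, labels):
        for i, syllable in enumerate(word):
            (stressed if i == stress_index else unstressed).append(syllable)
    return fit_gaussian(stressed), fit_gaussian(unstressed)


def word_features(word, models):
    """Stage 2 input: stressed/unstressed log likelihood of every syllable."""
    stressed_model, unstressed_model = models
    features = []
    for syllable in word:
        features.append(log_likelihood(syllable, stressed_model))
        features.append(log_likelihood(syllable, unstressed_model))
    return features


def fit_stage_two(words, labels, models):
    """Stage 2: one centroid per stress position (n classes for n syllables)."""
    groups = {}
    for word, stress_index in zip(words, labels):
        groups.setdefault(stress_index, []).append(word_features(word, models))
    return {
        index: [sum(column) / len(column) for column in zip(*rows)]
        for index, rows in groups.items()
    }


def predict(word, models, centroids):
    """Return the stress position whose centroid is closest to the word."""
    features = word_features(word, models)
    return min(centroids, key=lambda index: math.dist(features, centroids[index]))


# Hypothetical normalized [pitch, volume, duration] per syllable, 2-syllable words.
train_words = [
    [[1.0, 0.9, 0.8], [0.2, 0.3, 0.1]],
    [[0.9, 1.0, 0.7], [0.1, 0.2, 0.2]],
    [[0.2, 0.1, 0.2], [1.0, 0.8, 0.9]],
    [[0.3, 0.2, 0.1], [0.8, 1.0, 0.9]],
]
train_labels = [0, 0, 1, 1]  # 0-based index of the stressed syllable

models = fit_stage_one(train_words, train_labels)
centroids = fit_stage_two(train_words, train_labels, models)
print(predict([[0.95, 0.9, 0.85], [0.15, 0.2, 0.15]], models, centroids))
```

A separate second-stage model would be trained for each syllable count, since the number of classes and the feature length both depend on n.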
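In the experiments, the first method was evaluated in 8 experiments; its best recognition rate was 82.58%, obtained with the median of the pitch vector as the single feature. The second method was evaluated in 11 experiments, with the second stage classifying 2-, 3-, and 4-syllable words; the best recognition rates were 90.36%, 86.85%, and 85.65%, respectively, about 3 to 7 percentage points higher than the first method. The results show that the proposed method makes effective use of pitch, volume, and duration to detect stress in spoken English.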
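The reported improvement can be checked with a short calculation using only the recognition rates quoted in the abstract; the variable names are illustrative.

```python
# Best recognition rates (%) quoted in the abstract.
best_single_feature = 82.58
best_two_stage = {2: 90.36, 3: 86.85, 4: 85.65}  # keyed by syllable count

# Gain of the two-stage method over the single-feature method, in percentage points.
gains = {
    syllables: round(rate - best_single_feature, 2)
    for syllables, rate in best_two_stage.items()
}
print(gains)  # {2: 7.78, 3: 4.27, 4: 3.07}
```

The gains span roughly 3.1 to 7.8 percentage points, in line with the quoted range of about 3 to 7.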

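Chapter 1 Introduction    1
1 Research Topic    1
2 Related Work    1
3 Overview of the Proposed Methods and Main Results    2
4 Chapter Outline    3
Chapter 2 Stress Detection for English Words    4
1 Problem Definition    4
2 System Architecture and Workflow    4
3 Research Methods    8
3.1 Preprocessing of the Correct Stressed Syllable for Each Word    8
3.2 Feature Extraction and Normalization    8
3.3 Classification Methods    9
3.4 Detection Methods    13
Chapter 3 Experimental Results and Discussion    16
1 Dataset Description    16
2 Experiment 1: Detection Using a Single Feature Value    16
3 Experiment 2: Two-Stage Classification Using Combinations of Multidimensional Features    17
3.1 2-Dimensional Feature Combinations    17
3.2 3-Dimensional Feature Combinations    28
3.3 6-Dimensional Feature Combinations    37
3.4 9-Dimensional Feature Combinations    40
4 Experimental Analysis    42
5 Error Analysis    45
Chapter 4 Conclusions and Future Work    55
References    56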

                                


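Full-text release date: full text not authorized for public release (on-campus network)
Full-text release date: full text not authorized for public release (off-campus network)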
