
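Graduate student: 董姵汝 (Tung, Pei-Ju)
Thesis title: Improving Japanese Pronunciation Assessment by Utilizing Pitch Information (使用音高資訊來改進日文發音評量)
Advisor: 張智星 (Jang, Jyh-Shing Roger)
Oral examination committee: (not listed)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications
Year of publication: 2010
Academic year of graduation: 98 (ROC calendar)
Language: Chinese
Number of pages: 41
Keywords: pronunciation assessment, pitch information, computer-assisted pronunciation training (CAPT), computer-assisted language learning (CALL)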
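  • This thesis aims to improve Japanese pronunciation assessment by adding pitch information, and it measures the performance of the improved models with assessment-oriented evaluation methods.
    We first train baseline acoustic models on Mel-frequency cepstral coefficients (MFCCs) and log energy, using transcriptions that have been systematically adjusted to match the actual pronunciation more closely. We then add pitch as a further feature alongside MFCCs and log energy to improve on the baseline models. Pitch is extracted with two pitch-tracking methods, ACF (autocorrelation function) and UPDUDP (unbroken pitch determination using dynamic programming), which yield a broken pitch contour and an unbroken pitch contour respectively.
    To test how well the improved models perform in pronunciation assessment, we use two assessment-oriented test methods: a ranking-based confidence measure and pronunciation error detection. In our experiments the improved models give better overall assessment performance than the baseline models, but adding pitch features does not suit every phoneme. We therefore also experiment with selectively loading either the pitch-added model or the baseline model, and the results show a slight further gain in assessment performance over non-selective loading.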


    This work aims to improve Japanese pronunciation assessment by utilizing pitch information, and the proposed method is evaluated with several assessment-related performance measures.
    First, the baseline models are built from MFCCs (Mel-frequency cepstral coefficients) and log energy, with the transcriptions adjusted systematically to account for the particular properties of Japanese pronunciation. We then train improved acoustic models, called pitch-added models, on MFCCs, log energy, and pitch. ACF (autocorrelation function) and UPDUDP (unbroken pitch determination using dynamic programming) are adopted as the pitch extraction methods, generating a broken pitch contour and an unbroken pitch contour, respectively.
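    As an illustration of the ACF step only, the following is a minimal sketch of autocorrelation-based pitch estimation that leaves unvoiced frames at zero, which is what makes the resulting contour "broken". It is not the thesis implementation: the function names, the 80-500 Hz search range, the voicing threshold of 0.3, and the frame and hop sizes are all assumptions chosen for the example. UPDUDP is not shown.

```python
import math


def acf_pitch(frame, sample_rate, fmin=80.0, fmax=500.0, voicing_threshold=0.3):
    """Estimate the pitch of one frame in Hz, or return 0.0 if it looks unvoiced.

    fmin, fmax and voicing_threshold are illustrative assumptions.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]            # remove the DC offset
    energy = sum(s * s for s in x)           # autocorrelation at lag 0
    if energy == 0.0:
        return 0.0                           # silent frame: no pitch
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), n - 1)
    best_lag, best_value = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Normalized autocorrelation of the frame with a copy shifted by `lag`.
        value = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if value > best_value:
            best_lag, best_value = lag, value
    if best_lag == 0 or best_value < voicing_threshold:
        return 0.0                           # weak periodicity: treat as unvoiced
    return sample_rate / best_lag


def broken_pitch_contour(signal, sample_rate, frame_size=640, hop=320):
    """Frame-by-frame pitch in Hz; unvoiced or silent frames stay at 0.0."""
    return [
        acf_pitch(signal[start:start + frame_size], sample_rate)
        for start in range(0, len(signal) - frame_size + 1, hop)
    ]


if __name__ == "__main__":
    sr = 16000
    tone = [math.sin(2 * math.pi * 200 * i / sr) for i in range(1280)]
    print(broken_pitch_contour(tone + [0.0] * 1280, sr))
```

    For a 200 Hz tone followed by silence, the leading frames should come out close to 200 Hz and the trailing frames at zero. An unbroken contour would instead assign a pitch value to every frame.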
    The proposed method is evaluated with a ranking-based confidence measure and pronunciation error detection. Experimental results show that the proposed method outperforms the baseline. However, unvoiced phonemes are considered to have no pitch values. We therefore try loading the pitch-added models or the original ones selectively, and the experimental results show a slight improvement of the selective approach over the non-selective approach.
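    The record does not give the formula for the ranking-based confidence measure or the rule used for selective loading, so the following is only a rough sketch of the general idea under stated assumptions. It assumes each segment has a log-likelihood under every phoneme model, scores the target phoneme by its rank among them with a linear mapping to [0, 1], and switches between two score tables using a hypothetical set of phonemes assumed to benefit from pitch. All names, the mapping, and the example values are illustrative.

```python
def rank_of_target(log_likelihoods, target):
    """Return the 1-based rank of the target phoneme (1 = highest likelihood)."""
    target_score = log_likelihoods[target]
    return 1 + sum(1 for score in log_likelihoods.values() if score > target_score)


def rank_confidence(log_likelihoods, target):
    """Map the rank to [0, 1]: 1.0 if the target ranks first, 0.0 if it ranks last.

    The linear mapping is an assumption made for this example.
    """
    n = len(log_likelihoods)
    if n < 2:
        return 1.0
    return 1.0 - (rank_of_target(log_likelihoods, target) - 1) / (n - 1)


def select_scores(phoneme, baseline_scores, pitch_scores, pitch_friendly):
    """Use pitch-added model scores only for phonemes in the assumed pitch_friendly set."""
    return pitch_scores if phoneme in pitch_friendly else baseline_scores


if __name__ == "__main__":
    # Hypothetical log-likelihoods of one segment under three phoneme models.
    baseline = {"a": -10.0, "i": -12.0, "k": -15.0}
    pitch_added = {"a": -13.0, "i": -11.0, "k": -16.0}
    pitch_friendly = {"a", "i"}   # assumed set, e.g. voiced phonemes

    for target in ("i", "k"):
        scores = select_scores(target, baseline, pitch_added, pitch_friendly)
        print(target, rank_confidence(scores, target))
```

    A rank-based score depends only on the ordering of the competing models, not on the raw likelihood values, which keeps scores comparable across phonemes and across feature sets of different dimension.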

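    Abstract (Chinese)
    Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1
        1.1 Introduction
        1.2 The Japanese pronunciation unit: the mora
        1.3 Japanese accent
    Chapter 2 Related Work
        2.1 Computer-assisted pronunciation training based on automatic speech recognition
            2.1.1 Pronunciation scoring
            2.1.2 Pronunciation error detection
        2.2 Speech features and models
            2.2.1 Applications of pitch features
            2.2.2 Pitch feature extraction methods
            2.2.3 Related work on acoustic models
    Chapter 3 Method
        3.1 Training corpus
        3.2 Baseline acoustic models
            3.2.1 Corpus transcription issues
            3.2.2 Building the baseline acoustic models
        3.3 Acoustic models with pitch features
            3.3.1 Overview of multiple speech features
            3.3.2 Building the pitch-added models
    Chapter 4 Experimental Methods, Results, and Analysis
        4.1 Experimental corpus
        4.2 Experimental methods
            4.2.1 Ranking-based confidence measure
            4.2.2 Pronunciation error detection
        4.3 Experiment 1: Comparison of models with unbroken and broken pitch features
            4.3.1 Purpose
            4.3.2 Procedure and settings
            4.3.3 Results and analysis
        4.4 Experiment 2: Comparison of the baseline and pitch-added models
            4.4.1 Purpose
            4.4.2 Procedure and settings
            4.4.3 Results
            4.4.4 Error analysis
        4.5 Experiment 3: Selective loading of pitch-feature models
            4.5.1 Purpose
            4.5.2 Procedure and settings
            4.5.3 Results and analysis
    Chapter 5
        5.1 Conclusions
        5.2 Future work
    References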


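    Full-text release date: full text not authorized for public release (campus network)
    Full-text release date: full text not authorized for public release (off-campus network)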
