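Graduate student: | 洪俊詠 |
---|---|
Thesis title: | 馬可夫語言模型應用di台語變調gah注音 Markov Language Model Applied to Taiwanese Tone Sandhi and Phonetic Annotations |
Advisor: | 江永進 |
Committee members: | |
Degree: | Master |
Department: | College of Science - Institute of Statistics |
Year of publication: | 2005 |
Academic year of graduation: | 93 |
Language: | Chinese |
Number of pages: | 34 |
Keywords: | Taiwanese tone sandhi, phonetic annotation, Markov language model, Viterbi search, cross-word-boundary tone sandhi, colloquial tone |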
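Tone sandhi in Taiwanese can be divided into tone sandhi within isolated words and tone sandhi across a whole sentence. For isolated words there are simple sandhi rules that can be applied, but sentence-level sandhi is not so simple. Linguists generally discuss it at the syntactic level; this thesis instead approaches it mainly from a statistical viewpoint, using a Markov language model to model the phonetic annotation and tone sandhi of Taiwanese Buddhist sutras. We use seven volumes of Taiwanese Buddhist sutras published by 智觀寺 in Hsinchu as the corpus. The accuracy of annotation with tone sandhi is about 80% for the unigram model and 84% for the bigram model; both figures are out-of-sample test results obtained by cross-validation.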
Taiwanese is rich in tone sandhi, which poses a two-part problem: when to apply tone sandhi and how to apply it. For multi-syllabic words, major rules exist for both parts of the problem. In a complete sentence or a phrase consisting of multiple words, however, the word-level sandhi rules may not apply at the last syllable of each word. The traditional approach to this problem is syntactic analysis; this work instead studies tone sandhi with a statistical approach. Using as a corpus seven volumes of Buddhist sutras, published and phonetically annotated in Taiwanese by a senior nun, we model the phonetic transcription with a syllable-based Markov language model and focus specifically on the tone sandhi problem. A unigram model achieves 80% accuracy and a bigram model 84%. Both results are computed using seven-fold cross-validation.
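The difference between the unigram and bigram models in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the state definition (a character paired with its annotated reading), the add-one smoothing, the function names, and the toy corpus with placeholder readings such as `a1` (base tone) and `a7` (sandhi tone) are all assumptions made for illustration. The unigram baseline picks the most frequent reading of each character, while the bigram model runs a Viterbi search over the candidate readings.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical sentence-start marker


def train(corpus):
    """corpus: list of sentences, each a list of (char, reading) pairs."""
    unigram, bigram, context = Counter(), Counter(), Counter()
    candidates = defaultdict(set)
    for sentence in corpus:
        prev = START
        for char, reading in sentence:
            state = (char, reading)
            unigram[state] += 1
            bigram[(prev, state)] += 1
            context[prev] += 1
            candidates[char].add(reading)
            prev = state
    return {"unigram": unigram, "bigram": bigram,
            "context": context, "candidates": candidates}


def log_prob(prev, state, model, alpha=1.0):
    """Add-alpha smoothed log P(state | prev); the smoothing is an assumption."""
    vocab = max(1, len(model["unigram"]))
    num = model["bigram"][(prev, state)] + alpha
    den = model["context"][prev] + alpha * vocab
    return math.log(num / den)


def annotate_unigram(chars, model):
    """Baseline: most frequent reading of each character, ignoring context."""
    out = []
    for char in chars:
        readings = sorted(model["candidates"].get(char) or {"?"})
        out.append(max(readings, key=lambda r: model["unigram"][(char, r)]))
    return out


def annotate_bigram(chars, model):
    """Viterbi search for the best reading sequence under the bigram model."""
    layer = {START: (0.0, [])}  # state -> (best log score, best path so far)
    for char in chars:
        new_layer = {}
        for reading in sorted(model["candidates"].get(char) or {"?"}):
            state = (char, reading)
            score, path = max(
                ((s + log_prob(prev, state, model), p)
                 for prev, (s, p) in layer.items()),
                key=lambda t: t[0],
            )
            new_layer[state] = (score, path + [reading])
        layer = new_layer
    return max(layer.values(), key=lambda t: t[0])[1]


# Toy corpus with placeholder readings: "7" marks a sandhi tone in
# non-final position, "1" a base tone in final position.
toy_corpus = [
    [("A", "a7"), ("B", "b1")],
    [("A", "a7"), ("B", "b1")],
    [("B", "b7"), ("A", "a1")],
]
model = train(toy_corpus)
print(annotate_unigram(["B", "A"], model))
print(annotate_bigram(["B", "A"], model))
```

On this toy corpus the unigram baseline cannot tell final from non-final position, because it always returns the most frequent reading of a character, whereas the bigram search can use the neighbouring state to prefer the sandhi reading before another syllable.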
江永進(2005), “台語拼音課程”, 屏東:安可出版社.
邱玉雪(2004), “台灣閩南語偏正結構詞組中的變調分界”, 新竹師範學院碩士論文.
劉惠玫(2000), “用TTS輔助台語語料之處理”, 清華大學碩士論文.
劉亦真(2005), “建立T3剖析樹語料庫:台語部分”, 清華大學碩士論文.
釋達觀(2005), “E世代佛點羅馬拼音台語版—智觀之音”, 新竹:智觀寺.
Chiang, Yuang-Chin, Min-Siong Liang, Hong-Yi Lin, Ren-Yuan Lyu (2005), “A Bi-lingual Mandarin-To-Taiwanese Text-to-Speech Translational System”, The 9th European Conference on Speech Communication and Technology, Sep. 4-8, Lisboa, Portugal.