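Graduate student: | 張展嘉 Chan-Chia Chang |
---|---|
Thesis title: | 自由音節解碼在全文資訊檢索及語句辨識之應用 Using Free Syllable Decoding on Full-text Information Retrieval and Sentence Recognition |
Advisor: | 張智星 Jyh-Shing Roger Jang |
Oral defense committee: | |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Year of publication: | 2002 |
Academic year of graduation: | 90 (ROC calendar) |
Language: | Chinese |
Number of pages: | 33 |
Chinese keywords: | 自由音節解碼, 語音辨識 |
Foreign-language keywords: | free syllable decoding, speech recognition |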
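This thesis discusses how to apply an imprecise syllable recognizer to full-text information retrieval and sentence recognition. The recognizer performs toneless syllable recognition and does not use grammatical structure to adjust the recognized syllable strings.
Applying such a recognizer raises two difficulties, and this thesis proposes a corresponding solution for each:
First, syllables are easily misrecognized.
Even when a syllable is misrecognized, the erroneous syllable still carries useful information. We therefore define a "syllable similarity table" that describes the relationships between syllables, which increases the system's tolerance to recognition errors.
Second, the number of decoded syllables may differ from what is expected.
That is, "deletion errors" and "insertion errors" occur. The solution depends on the application. For sentence recognition, we use "dynamic time warping" to make string matching more flexible; for full-text information retrieval, we use an intersection-like method combined with weights to compute the score of each document.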
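The intersection-with-weights idea for retrieval can be illustrated with a minimal sketch. The abstract does not state how the weights are defined, so the inverse-document-frequency weighting below is an assumption, as are the function names `syllable_weights` and `score_documents`, the exact-match intersection, and the toy documents.

```python
import math
from collections import Counter

def syllable_weights(documents):
    """Assumed weighting: syllables that occur in fewer documents weigh more."""
    n_docs = len(documents)
    doc_freq = Counter()
    for syllables in documents.values():
        doc_freq.update(set(syllables))
    return {s: math.log(1 + n_docs / df) for s, df in doc_freq.items()}

def score_documents(query, documents, weights):
    """Score each document by the summed weights of the syllables it shares with the query."""
    query_set = set(query)
    scores = {}
    for doc_id, syllables in documents.items():
        shared = query_set & set(syllables)  # exact-match intersection (assumption)
        scores[doc_id] = sum(weights.get(s, 0.0) for s in shared)
    # Highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical documents, already converted to toneless syllable lists.
documents = {
    "d1": ["jin", "tian", "tian", "qi"],
    "d2": ["ming", "tian", "kao", "shi"],
    "d3": ["yu", "yin", "bian", "shi"],
}
weights = syllable_weights(documents)
# "hao" stands for a spurious syllable that matches no document.
ranking = score_documents(["tian", "qi", "hao"], documents, weights)
```

Because the score only counts shared syllables, an inserted or deleted syllable in the query changes the score slightly instead of breaking the match, which is the kind of tolerance the abstract describes.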
In this thesis, we discuss methods for constructing a Mandarin syllable recognizer that can be used for diverse applications such as "full-text information retrieval" and "sentence recognition". The syllable recognizer performs recognition without using tone information or language models, which leads to a low recognition rate at the syllable level. We therefore identify the associated problems and propose methods to deal with them.
The first problem is the low recognition rate at the syllable level. A recognized syllable may be incorrect, yet it still bears similarity to the intended syllable. We therefore establish a "syllable-similarity table" that describes the similarity between any two syllables, and use the similarity scores to rank the possible outputs. With this enhancement, the system becomes more robust to recognition errors.
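As an illustration of how such a table might be used for ranking, here is a minimal sketch. The table entries and their values are hypothetical (the thesis's actual table is not reproduced in this record), and the names `similarity`, `sequence_score`, and `rank_candidates` are assumptions introduced for the example. It only handles candidates of the same length as the recognized string.

```python
# Hypothetical similarity scores between toneless Mandarin syllables (pinyin).
# Values are illustrative only and are not taken from the thesis.
SIMILARITY = {
    frozenset(("zhi", "zi")): 0.8,
    frozenset(("shi", "si")): 0.8,
    frozenset(("lan", "nan")): 0.6,
}

def similarity(a, b):
    """Return 1.0 for identical syllables, the table value for a listed pair, else 0.0."""
    if a == b:
        return 1.0
    return SIMILARITY.get(frozenset((a, b)), 0.0)

def sequence_score(recognized, candidate):
    """Sum position-wise similarities between two syllable strings of equal length."""
    if len(recognized) != len(candidate):
        raise ValueError("this sketch only compares strings of equal length")
    return sum(similarity(r, c) for r, c in zip(recognized, candidate))

def rank_candidates(recognized, candidates):
    """Order candidate syllable strings from most to least similar."""
    return sorted(candidates, key=lambda c: sequence_score(recognized, c), reverse=True)

# The recognizer heard "zi dao"; "zhi dao" still ranks first because zi ~ zhi.
ranked = rank_candidates(["zi", "dao"], [["bu", "dao"], ["zhi", "dao"]])
```

Storing each pair as a `frozenset` keeps the table symmetric, so `similarity("zi", "zhi")` and `similarity("zhi", "zi")` return the same value.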
The second problem is that the number of syllables the recognizer decodes may not be correct. The most common errors are "deletion errors" and "insertion errors". Different applications call for different strategies to deal with this problem. In sentence recognition, we apply the concept of "dynamic time warping" to make the string-matching process more flexible. In full-text information retrieval, we use "syllable intersection" and "syllable weights" to compute the score of each retrieved document.
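A minimal sketch of dynamic time warping applied to syllable strings follows, to show how the alignment can absorb an inserted or deleted syllable. It uses the textbook DTW recurrence with three predecessor steps. The local cost (one minus a hypothetical similarity score), the table values, and the names `syllable_distance`, `dtw_distance`, and `recognize_sentence` are assumptions, not the thesis's actual formulation.

```python
# Hypothetical similarity scores between toneless syllables (illustrative values only).
SIMILARITY = {
    frozenset(("jin", "jing")): 0.7,
    frozenset(("qi", "chi")): 0.6,
}

def syllable_distance(a, b):
    """Local cost: 0.0 for identical syllables, 1.0 for pairs not in the table."""
    if a == b:
        return 0.0
    return 1.0 - SIMILARITY.get(frozenset((a, b)), 0.0)

def dtw_distance(recognized, reference):
    """Accumulated cost of the cheapest warping path between two syllable strings."""
    n, m = len(recognized), len(reference)
    inf = float("inf")
    # D[i][j] = best cost of aligning the first i recognized and first j reference syllables.
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = syllable_distance(recognized[i - 1], reference[j - 1])
            D[i][j] = cost + min(
                D[i - 1][j - 1],  # both strings advance
                D[i - 1][j],      # extra recognized syllable (insertion error)
                D[i][j - 1],      # missing recognized syllable (deletion error)
            )
    return D[n][m]

def recognize_sentence(recognized, candidates):
    """Pick the candidate sentence whose syllable string is closest under DTW."""
    return min(candidates, key=lambda c: dtw_distance(recognized, c))

# "tian" was decoded twice (an insertion error), yet the match is still exact.
candidates = [["ming", "tian", "jian"], ["jin", "tian", "qi"]]
best = recognize_sentence(["jin", "tian", "tian", "qi"], candidates)
```

In this sketch the duplicated syllable is mapped onto the same reference syllable at no cost, whereas a position-by-position comparison would misalign everything after the insertion.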