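| Graduate student: | 田明杰 |
|---|---|
| Thesis title: | 運用自我相關函數過零率於聲帶音與非聲帶音之語音音框分類 Speech and Non-Speech Classification Using Zero Crossing Rate of Autocorrelation Function |
| Advisor: | 江永進 |
| Oral defense committee: | |
| Degree: | Master (碩士) |
| Department: | College of Science - Institute of Statistics (理學院 - 統計學研究所) |
| Year of publication: | 2003 |
| Academic year of graduation: | 91 (ROC calendar) |
| Language: | Chinese |
| Chinese keywords: | two-sided rainfall method (雙面淋雨法), double autocorrelation (雙重自我相關), zero crossing rate (過零率), speech frame classification (語音音框分類) |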
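In this thesis, we introduce two new methods for distinguishing voiced from unvoiced speech. The first uses the zero crossing rate of a frame after rainfall as a classification index; it is mainly used to separate silence frames from speech signal frames. The second uses the zero crossing rate of the double autocorrelation function as a classification index; it is mainly used to pick out voiced frames from speech signal frames. The zero crossing rate of the double autocorrelation function has the advantage of being more stable than the ordinary zero crossing rate, while the zero crossing rate after rainfall replaces, in a different form, the commonly used energy feature. These two features are applied to the TIMIT speech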
In this thesis, we propose two new features for classifying a speech frame as voiced or unvoiced. The first is the zero crossing rate of a frame after rainfall, which helps to separate silence frames from speech frames. The second is the zero crossing rate of the double autocorrelation function of a frame, which helps to identify voiced frames among speech frames. The zero crossing rate of the double autocorrelation function is also shown to be more robust than the ordinary zero crossing rate. The zero crossing rate after rainfall serves, in effect, as a replacement for the energy feature in speech / non-speech frame classification. These two features are extensively tested on the TIMIT speech database, and the correct classification rate is 92.4%.
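As a rough illustration of the second feature, the sketch below computes a zero crossing rate on a twice-applied autocorrelation. It rests on assumptions the abstract does not confirm: "double autocorrelation" is taken to mean the autocorrelation of the autocorrelation sequence, the biased estimator is used, and the function names, frame length, and test signals are hypothetical. The rainfall step is not described in the abstract, so it is left out.

```python
import math
import random


def zero_crossing_rate(x):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(x) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(x, x[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(x) - 1)


def autocorrelation(x):
    """Biased sample autocorrelation at lags 0..len(x)-1 (mean not removed)."""
    n = len(x)
    return [
        sum(x[i] * x[i + lag] for i in range(n - lag)) / n
        for lag in range(n)
    ]


def double_autocorrelation_zcr(frame):
    """Zero crossing rate of the autocorrelation applied twice.

    Assumption: this reading of "double autocorrelation" is a guess,
    not the definition used in the thesis.
    """
    return zero_crossing_rate(autocorrelation(autocorrelation(frame)))


# Hypothetical test frames: a periodic (voiced-like) tone and white noise.
FRAME_LEN = 200
rng = random.Random(0)
voiced = [math.sin(2 * math.pi * i / 20) for i in range(FRAME_LEN)]
noise = [rng.gauss(0.0, 1.0) for _ in range(FRAME_LEN)]

print("voiced-like frame:", double_autocorrelation_zcr(voiced))
print("noise frame:      ", double_autocorrelation_zcr(noise))
```

On these synthetic frames the periodic signal should give a noticeably lower value than the noise, which is the kind of separation a threshold-based voiced/unvoiced decision would rely on. Real speech frames, and the thesis's actual thresholds, may behave differently.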