
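Graduate student: 黃承德 (Huang, Cheng-De)
Thesis title: Speech Enhancement based on Microphone Array and Speech Estimation (以麥克風陣列及語音預估為基礎的語音增強之研究)
Advisor: 王小川 (Wang, Hsiao-Chuan)
Committee members:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering (電機資訊學院 - 電機工程學系)
Year of publication: 2009
Academic year of graduation: 98 (ROC calendar)
Language: Chinese
Number of pages: 47
Keywords (Chinese): 語音增強, 麥克風陣列, 雜訊預估, 語音預估
Keywords (English): speech enhancement, microphone array, noise estimation, speech estimation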
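  • The front end of this thesis follows the Griffiths-Jim beamformer (GJBF) architecture. Its output is converted to a single-channel signal for back-end processing, giving an effective method for suppressing background noise in speech signals. The fixed beamformer is a delay-and-sum beamformer: the time delay between microphones is estimated from the cross-correlation of the signals in the time domain, and the channels are then delayed and summed. This is intended to cancel zero-mean background noise and produce a signal dominated by the sound source. Next, the blocking matrix is replaced by adaptive filters that generate the reference noise, which reduces the amount of source signal leaking into the reference noise. Because this leakage cannot be avoided completely, minima controlled recursive averaging (MCRA) is applied to the reference noise for further noise estimation. MCRA estimates the noise spectrum recursively: it tracks local minima and compares an SNR-like parameter against a threshold to decide whether each frequency band contains speech, then estimates the noise according to that decision. This yields two channels, one dominated by the source and one dominated by noise, which are combined by basic spectral subtraction for speech enhancement.
    The front end produces a single-channel speech signal. Because this signal still contains noise, a second, single-channel speech enhancement pass is applied. This back end consists of MCRA noise estimation followed by noise removal with the optimally modified log-spectral amplitude (OM-LSA) estimator. OM-LSA takes the noisy speech signal as input and estimates a gain function; multiplying the input signal by this gain gives an estimate of the clean speech.
    Finally, the resulting signal is converted back to the time domain with the inverse Fourier transform and overlap-add to reconstruct the waveform.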
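
The delay-and-sum stage at the start of the front end described above can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `estimate_delay` and `delay_and_sum`, the `max_lag` search range, the choice of the first channel as the alignment reference, and the restriction to integer-sample delays are all assumptions made for illustration. The code is plain Python and searches for the lag that maximizes the time-domain cross-correlation, then aligns and averages the channels.

```python
def estimate_delay(ref, sig, max_lag):
    """Return the lag (in samples) at which `sig` best matches `ref`.

    A positive result means `sig` is a delayed copy of `ref`.
    Hypothetical helper: integer lags only, brute-force search.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation at this lag: sum of ref[t] * sig[t + lag],
        # restricted to indices that exist in both signals.
        start = max(0, -lag)
        stop = min(len(ref), len(sig) - lag)
        score = sum(ref[t] * sig[t + lag] for t in range(start, stop))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_and_sum(channels, max_lag):
    """Align every channel to channel 0 and average the aligned samples."""
    ref = channels[0]
    lags = [estimate_delay(ref, ch, max_lag) for ch in channels]
    out = []
    for t in range(len(ref)):
        # Average only the channels that have a sample at the shifted index.
        vals = [ch[t + lag] for ch, lag in zip(channels, lags)
                if 0 <= t + lag < len(ch)]
        out.append(sum(vals) / len(vals))
    return out, lags


# Toy two-microphone example (assumed data): mic2 hears the source 2 samples late.
source = [0, 0, 0, 1, 3, -2, 4, -1, 0, 0, 0, 0, 0, 0]
mic2 = [0, 0] + source[:-2]
aligned, lags = delay_and_sum([source, mic2], max_lag=4)
print(lags)  # [0, 2]
```

In this noise-free toy case the aligned average reproduces the source exactly. With independent zero-mean noise on each microphone, averaging the aligned channels would leave the source component unchanged while reducing the noise power, which is the motivation given in the abstract.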


    Chapter 1 Introduction
      1.1 Research Background
      1.2 Research Direction
      1.3 Chapter Overview
    Chapter 2 Time Delay Estimation Methods
      2.1 Time-Domain Cross-Correlation
      2.2 Multichannel Cross-Correlation
        2.2.1 Signal Model
        2.2.2 Estimating the Time Delay with the Spatial Estimation Concept
    Chapter 3 GJBF Architecture
      3.1 Fixed Beamforming Algorithm
      3.2 Generating the Reference Noise with Adaptive Filter Structures such as LAF-LAF
    Chapter 4 Single-Channel Speech Enhancement Methods
      4.1 Noise Spectrum Estimation
        4.1.1 Time-Varying Recursive Averaging
        4.1.2 Minimum Spectrum Estimation
      4.2 Speech Spectrum Estimation
    Chapter 5 Experimental Results and Discussion
      5.1 Experimental Environment
      5.2 Evaluation Methods
      5.3 Experimental Results and Discussion
    Chapter 6 Conclusion
    References


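    Full-text release date: full text not authorized for public release (on-campus network)
    Full-text release date: full text not authorized for public release (off-campus network)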
