Speech Enhancement and Conversion for Hearing-Impaired | National Tsing Hua University Theses and Dissertations Repository


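Graduate student: 葉向林 Hsiang-Lin Yeh
Thesis title: 聽障者之語音增強與轉換 Speech Enhancement and Conversion for Hearing-Impaired
Advisor: 王小川 Hsiao-Chuan Wang
Oral defense committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2004
Academic year of graduation: 92 (ROC calendar)
Language: Chinese
Number of pages: 49
Keywords (translated from Chinese): hearing-impaired, speech enhancement, speech conversion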

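Hearing-impaired listeners understand speech in noisy environments far less well than listeners with normal hearing. To improve their comprehension of speech in noise, the first step is speech enhancement, that is, noise removal. This thesis performs speech enhancement using the concept of auditory masking. After the noise has been removed, the enhanced speech is converted using a frequency division based on the equivalent rectangular bandwidth (ERB) scale together with the rounded exponential (roex) model, with the aim of giving hearing-impaired listeners the greatest possible help in noisy environments.

When a speech signal is corrupted by noise, listeners with normal hearing experience a lower recognition rate and listening discomfort, and the effect is more severe for hearing-impaired listeners. The first step of speech enhancement is therefore to remove the noise from the noisy speech signal. The enhancement processing applies the concept of auditory masking; the typical approach is to multiply the power spectrum of the speech signal by a gain function.

Hearing-impaired listeners have difficulty hearing sounds, whether the sounds are unclear or inaudible, and this difficulty has a cause. For someone who needs a hearing aid, the aid is effective only when it is optimally matched to the listener's hearing characteristics. This study seeks to identify why hearing-impaired listeners hear sounds unclearly or not at all, and to process the speech signal to address that cause so that the signal best matches the listener's hearing characteristics. Each hearing-impaired listener's condition differs; this study applies a suitable conversion to the signal only for conditions that are common among hearing-impaired listeners in general. Ordinary speech varies rapidly across the spectrum, which hearing-impaired listeners may be unable to fully receive or understand, and they also hear high-frequency components less well. The second processing step therefore reduces the low-frequency variation and the range of fluctuation of the speech signal and raises the high-frequency spectral values to suit the needs of hearing-impaired listeners.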
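The abstract describes enhancement as multiplying the power spectrum of the noisy speech by a gain function. The following is a minimal sketch of that step only. It assumes a simple power-subtraction gain with a spectral floor as a stand-in for the masking-threshold-based gain of the thesis, whose exact form is not given in this record. The function names `spectral_gain` and `enhance_frame` and the `floor` parameter are illustrative, not taken from the thesis.

```python
import numpy as np


def spectral_gain(noisy_power, noise_power, floor=0.05):
    """Per-bin gain in [floor, 1] from a power-subtraction rule.

    Assumption: this rule is an illustrative stand-in, not the
    masking-based gain function used in the thesis.
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    # Estimated fraction of each bin's power that belongs to speech.
    ratio = 1.0 - noise_power / np.maximum(noisy_power, 1e-12)
    # The floor keeps bins from being zeroed out completely.
    return np.clip(ratio, floor, 1.0)


def enhance_frame(frame, noise_power, floor=0.05):
    """Apply the gain to one frame and return the time-domain result."""
    spectrum = np.fft.rfft(frame)
    power = np.abs(spectrum) ** 2
    gain = spectral_gain(power, noise_power, floor)
    # The gain acts on the power spectrum, so the complex spectrum is
    # scaled by sqrt(gain); the noisy phase is left unchanged.
    return np.fft.irfft(np.sqrt(gain) * spectrum, n=len(frame))
```

Here `noise_power` must have one value per `rfft` bin, that is, `len(frame) // 2 + 1` values. In practice it would come from a noise estimate such as an average over speech-free frames, which is a separate design choice not shown here.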

Speech comprehension for hard-of-hearing listeners is far worse than for normal-hearing listeners in noisy environments. To improve their speech perception in noise, speech enhancement, namely noise reduction, must be performed first. This thesis applies the auditory masking effect to speech enhancement. After the noise is reduced, we use the equivalent rectangular bandwidth (ERB) scale with a matching rounded exponential (roex) model to convert the speech in the frequency domain. In this way, we aim to help hard-of-hearing listeners improve their speech perception.
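The ERB scale and the rounded exponential model named in the abstract can be sketched as follows. This sketch assumes the Glasberg and Moore (1990) ERB formula and a symmetric one-parameter roex(p) filter; the thesis may use a different ERB formula or an asymmetric filter. The `broadening` factor, which widens the filter to mimic reduced frequency selectivity, and the function names are hypothetical.

```python
import numpy as np


def erb_hz(fc_hz):
    """ERB in Hz at centre frequency fc_hz.

    Assumption: Glasberg and Moore (1990) formula.
    """
    return 24.7 * (4.37 * np.asarray(fc_hz, dtype=float) / 1000.0 + 1.0)


def roex_weight(f_hz, fc_hz, broadening=1.0):
    """roex(p) weight W(g) = (1 + p*g) * exp(-p*g), with g = |f - fc| / fc."""
    fc = float(fc_hz)
    g = np.abs(np.asarray(f_hz, dtype=float) - fc) / fc
    # A symmetric roex(p) filter has ERB = 4*fc/p, so p = 4*fc/ERB.
    # broadening > 1 widens the filter (hypothetical impaired-ear setting).
    p = 4.0 * fc / (broadening * erb_hz(fc))
    return (1.0 + p * g) * np.exp(-p * g)


def excitation_pattern(power, freqs_hz, centres_hz, broadening=1.0):
    """Sum the power spectrum through a roex filter at each centre frequency."""
    power = np.asarray(power, dtype=float)
    return np.array(
        [np.sum(roex_weight(freqs_hz, fc, broadening) * power) for fc in centres_hz]
    )
```

With `broadening=1.0` the filter width follows the assumed normal-hearing ERB; larger values give a flatter, more smeared excitation pattern for the same input spectrum.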

Table of Contents
Chapter 1  Introduction
   1.1    Research Motivation
   1.2    Research Direction
   1.3    Chapter Overview
Chapter 2  Speech Enhancement Methods in Noisy Environments
   2.1    Overview of Conventional Methods
      2.1.1   Spectral Subtraction
      2.1.2   Wiener Filter
   2.2    Speech Enhancement Methods Exploiting Auditory Properties
Chapter 3  Estimation of the Auditory Masking Threshold
   3.1    Auditory Masking Effect
   3.2    Estimating the Auditory Masking Threshold
      3.2.1   Critical-Band Analysis
      3.2.2   Convolution with the Spreading Function
      3.2.3   Threshold Estimation
      3.2.4   Threshold Normalization
Chapter 4  Speech Enhancement System with Auditory Effects
   4.1    Estimating the Noise Spectrum
   4.2    Effect of the vb and ab Values on the Gain Function
   4.3    Derivation of the ab Value
   4.4    Restoring the Speech Signal
   4.5    Converting the Signal from the Frequency Domain Back to the Time Domain
Chapter 5  Speech Conversion Suited to Hearing-Impaired Listeners
   5.1    Auditory Phenomena and Characteristics of Hearing-Impaired Listeners
   5.2    Processing Flow
   5.3    Rounded Exponential Model
   5.4    Computing the Excitation Pattern
Chapter 6  Experimental Results and Discussion
   6.1    Experimental Environment
   6.2    SNR and Itakura-Saito Distance
   6.3    Experimental Results
   6.4    Discussion
Chapter 7  Conclusions and Future Work
References

List of Figures
Figure 1. Flowchart of spectral subtraction
Figure 2. Flowchart of the iterative Wiener filter
Figure 3. System block diagram
Figure 4. Number of bands versus frequency for CB and ERB
Figure 5. AMT computation flow
Figure 6. Spreading function
Figure 7. Absolute threshold of hearing curve
Figure 8. Power spectrum of one frame of clean speech and the corresponding AMT
Figure 9. Power spectrum of noisy speech (factory noise, 0 dB) and the corresponding AMT
Figure 10. Relationship between instantaneous SNR and gain
Figure 11. Illustration of a smeared spectrum
Figure 12. Auditory filter shape at a center frequency of 1000 Hz (normal-hearing listener)
Figure 13. Auditory filter shape at a center frequency of 1000 Hz (hearing-impaired listener)
Figure 14. Illustration of the excitation pattern (normal-hearing listener)
Figure 15. (top) Power spectrum of clean speech; (middle) excitation pattern Y; (bottom) corrected power spectrum Yc
Figure 16. (i)~(iii): clean speech signal, noisy speech signal, and enhanced speech signal, respectively
Figure 17. Spectra of one frame corresponding to each of Figure 16 (i)~(iii)
Figure 18. (i)~(ii): enhanced speech signal and speech signal converted to suit hearing-impaired listeners, respectively
Figure 19. Spectra of one frame corresponding to each of Figure 18 (i)~(iii)

List of Tables
Table 1. CB center and edge frequencies and the corresponding FFT bin indices
Table 2. ERB center and edge frequencies and the corresponding FFT bin indices
Table 3. IS distance of noisy speech
Table 4. SNR improvement with 0 iterations, vb(i)=1
Table 5. SNR improvement with 1 iteration, vb(i)=1
Table 6. SNR improvement with 2 iterations, vb(i)=1
Table 7. SNR improvement with 0 iterations, vb(i)=2
Table 8. SNR improvement with 1 iteration, vb(i)=2
Table 9. SNR improvement with 2 iterations, vb(i)=2
Table 10. IS distance with 0 iterations, vb(i)=1
Table 11. IS distance with 1 iteration, vb(i)=1
Table 12. IS distance with 2 iterations, vb(i)=1
Table 13. IS distance with 0 iterations, vb(i)=2
Table 14. IS distance with 1 iteration, vb(i)=2
Table 15. IS distance with 2 iterations, vb(i)=2

                                


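Full-text release date: full text not authorized for public release (campus network)
Full-text release date: full text not authorized for public release (off-campus network)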
