
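Graduate student: 顏嘉佑 (Yeng, Chia-Yo)
Thesis title: 在時頻空間以二階段法作盲音源分離 (Two-stage Method for Blind Source Separation in Time-Frequency Domain)
Advisor: 王小川 (Wang, Hsiao-Chuan)
Oral defense committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2009
Academic year of graduation: 97 (ROC calendar)
Language: English
Number of pages: 54
Keywords (Chinese): 盲訊號分離 (blind signal separation)
Keywords (English): BSS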
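  • This thesis studies algorithms for blind source separation (BSS) of convolutive mixtures, with the aim of solving the cocktail party problem of speech signal processing in real environments. Independence is measured through correlation. Because correlation is a second-order statistic, it is represented by a symmetric square matrix, the covariance matrix. In practice, the signals are first transformed to the frequency domain, and their cross-spectra are computed to represent the second-order statistics of the speech signals. A joint diagonalization algorithm then computes a demixing matrix at each discrete frequency so that the separated signals are as uncorrelated as possible. To improve separation, we assume that speech signals are sparse in the time-frequency domain, estimate which speaker dominates each time-frequency point, and use the eigenvalues of the covariance matrix to approximate the energy ratio between the target speaker and the interfering signal at that point. A set of masks is built from this ratio, and the already decorrelated signals are passed through the masks to suppress the interference further. To keep the masks from introducing discontinuities into the spectra of the separated signals, we transform each mask to the cepstral domain. The low-quefrency part is processed with a smaller smoothing coefficient, which preserves the harmonic structure of the separated signals, and the high-quefrency part is smoothed with a larger coefficient so that the spectra of the separated signals do not become overly discontinuous. In the experiments, we separate two-speaker mixtures recorded with two microphones in a real room. A subjective evaluation comparing the decorrelated-only signals with the mask-enhanced signals shows that the masks do suppress the interference further.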
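
The cepstral-domain mask smoothing described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation. It assumes that smoothing is a first-order recursion over time frames and that the "smoothing coefficient" is the weight given to the previous frame. The function name `smooth_mask_cepstral`, the parameters `n_low`, `beta_low`, `beta_high`, and `floor`, and all default values are hypothetical.

```python
import numpy as np


def smooth_mask_cepstral(mask, n_low=8, beta_low=0.2, beta_high=0.8, floor=1e-3):
    """Smooth a time-frequency mask in the cepstral domain (illustrative sketch).

    mask: array of shape (n_frames, n_bins) with values in (0, 1], where
          n_bins = n_fft // 2 + 1 (one-sided spectrum).
    n_low: assumed number of quefrency bins treated as "low quefrency".
    beta_low / beta_high: assumed weights on the previous frame for the
          low- and high-quefrency parts (larger weight = stronger smoothing).
    """
    mask = np.clip(np.asarray(mask, dtype=float), floor, 1.0)
    n_frames, n_bins = mask.shape
    n_fft = 2 * (n_bins - 1)

    # Cepstral representation of each frame: inverse real FFT of the log-mask.
    ceps = np.fft.irfft(np.log(mask), n=n_fft, axis=1)

    # Quefrency-dependent coefficients, mirrored so the cepstrum stays symmetric.
    q = np.arange(n_fft)
    q_sym = np.minimum(q, n_fft - q)
    beta = np.where(q_sym < n_low, beta_low, beta_high)

    # First-order recursive smoothing along the time axis, per quefrency bin.
    smoothed = np.empty_like(ceps)
    smoothed[0] = ceps[0]
    for t in range(1, n_frames):
        smoothed[t] = beta * smoothed[t - 1] + (1.0 - beta) * ceps[t]

    # Back to the spectral domain and undo the logarithm.
    log_mask = np.fft.rfft(smoothed, axis=1).real
    return np.clip(np.exp(log_mask), floor, 1.0)


# Usage with a synthetic mask: 20 frames, 129 frequency bins (n_fft = 256).
rng = np.random.default_rng(0)
demo_mask = rng.uniform(0.05, 1.0, size=(20, 129))
demo_smoothed = smooth_mask_cepstral(demo_mask)
```

With these assumed settings, the first few quefrency bins follow the raw mask closely from frame to frame, while the remaining bins change more slowly, which mirrors the small-coefficient and large-coefficient treatment the abstract describes.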


    1 Introduction
    2 Independent component analysis
    2.1 Background
    2.2 Pre-Processing of ICA
    2.3 Criteria for Independency
    2.3.1 Quantitative Measures for Uncorrelation
    2.3.2 Quantitative Measures for Non-Gaussianity
    2.4 Algorithms of Optimization
    3 Model and Ambiguities of BSS
    3.1 Instantaneous Model
    3.1.1 Eliminating Cross Correlation in Instantaneous Model
    3.2 Convolutive Model
    3.2.1 Short time Fourier transform for speech signals
    3.3 Method for Dealing with Permutation Ambiguity
    3.3.1 DOA approach
    3.3.2 Approach of relative measure
    3.4 Method for Dilation Ambiguity
    4 Joint Diagonalization Algorithm
    4.1 Overview of JADIAG
    4.2 Diagonalize two Hermitian Matrices of order two
    4.3 Termination Condition
    5 Post-Processing
    5.1 Post-Processing by Masking
    5.2 Method for Estimating Energy of Each Source
    5.3 Mask Smoothing in the Cepstral Domain
    5.4 Mask Pattern
    5.5 Post-Permutation Process
    6 Experiment
    6.1 Two-stage BSS Method in this Thesis
    6.2 Real-room Recorded Mixing Signals
    6.3 Assumption of Uncorrelation
    6.4 Performance Measure
    6.4.1 Log Spectral Distance
    6.4.2 Mean Opinion Score
    6.5 Demonstrate the Mask Pattern
    6.6 Demonstration of Post-Permutation Process
    6.7 Experiment Result
    7 Summary and Future Work


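    Full text release date: the full text is not authorized for public release (on-campus network)
    Full text release date: the full text is not authorized for public release (off-campus network)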
