Graduate student: | 陳昭熙 Chen, Zhao-Xi |
---|---|
Thesis title: | 基於聯合近似對角化之即時語音分離系統 A Real-time Speech Separation System based on Joint Approximate Diagonalization Algorithm |
Advisor: | 王小川 Wang, Hsiao-Chuan |
Oral defense committee: | |
Degree: | 碩士 Master |
Department: | 電機資訊學院 - 電機工程學系 Department of Electrical Engineering |
Year of publication: | 2010 |
Academic year of graduation: | 98 |
Language: | Chinese |
Number of pages: | 49 |
Chinese keywords: | 盲訊號分離 (blind signal separation), 獨立成份分析 (independent component analysis), 聯合對角化 (joint diagonalization), 即時系統 (real-time system) |
English keywords: | blind source separation, BSS, Independent Component Analysis, ICA, Joint Approximate Diagonalization, real-time system |
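This thesis studies the real-time separation of convolutively mixed blind sound sources in a real environment. In ordinary environments the mixing channel has memory, so the sources are mixed convolutively, and the computational cost is far higher than for instantaneous mixing. Literature on real-time systems that handle convolutive mixtures is therefore relatively scarce, and such work can be said to be devoted specifically to separating speech signals. To make the system more efficient, we perform blind source separation in the frequency domain; the ambiguities that this introduces are also a focus of this thesis. Blind source separation is achieved with an independent component analysis algorithm whose core is joint diagonalization. The joint approximate diagonalization algorithm uses second-order statistics to maximize the independence of the separated signals. Applying it to blind signal separation requires no centering or whitening of the mixed signals, which avoids the changes in statistical properties that such preprocessing can cause. In a real-time blind source separation system the input is only a short speech segment, so obtaining accurate separation information from so little input is another focus of this thesis. We use an online processing architecture to realize the real-time separation system, and we use the same architecture to design a recursive scheme that solves the permutation problem. The real-time system is implemented in Simulink. In the experiments, two microphones are used to separate the mixed voices of two speakers in a real environment: an audio interface feeds the two microphone signals into a computer, which computes the separation information and accumulates it to produce the separated audio in real time. The separation results are acceptable and do not differ much from those of a batch-processing system.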
Real-time blind source separation (BSS) is a technique for recovering independent sources from mixed signals in an online system, without any prior knowledge of the sources or the mixing channels. This thesis studies the BSS problem for speech signals recorded in a real environment. The research is divided into three parts. The first is to decorrelate the mixed signals in the time-frequency domain, using covariance matrices to measure the independence of the components. The idea is to derive an algorithm that recovers clean speech from the different channels. The second is to exploit characteristics of human hearing. In this way, more parameters are added to the algorithm to lower its complexity and shorten its computing time, so that the system can be implemented in real time. However, this is not yet suitable for implementation in a realistic environment.
Therefore, the third part is to turn the system into an online algorithm by exploiting the fact that speech signals are non-stationary in the time domain. The processing learns accurate optimal values for these delays and attenuations in a complicated parameter space, using the previous and current input. Our BSS system for acoustic source separation can be implemented in a realistic environment, and the results show better performance with reduced computational complexity.
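The second-order idea behind joint diagonalization can be illustrated with a small Python sketch. It is not the algorithm used in the thesis: it handles only the simplest case, exact joint diagonalization of two covariance matrices for a single frequency bin, by way of an eigendecomposition. The mixing matrix `a`, the source powers `d1` and `d2`, and the helper names are all hypothetical values chosen for illustration. The sketch assumes the source powers differ between two time blocks (speech is non-stationary), and it uses no centering or whitening step.

```python
import numpy as np

def joint_diagonalize_pair(r1, r2):
    """Return W such that W @ r_k @ W^H is diagonal for both inputs.

    If r_k = A @ D_k @ A^H with diagonal D_k, the eigenvectors of
    r2 @ inv(r1) are the columns of A up to scaling and ordering,
    so the inverse of the eigenvector matrix separates the sources.
    """
    _, vecs = np.linalg.eig(r2 @ np.linalg.inv(r1))
    return np.linalg.inv(vecs)

def off_diagonal_ratio(m):
    """Frobenius norm of the off-diagonal part relative to the whole matrix."""
    off = m - np.diag(np.diag(m))
    return np.linalg.norm(off) / np.linalg.norm(m)

rng = np.random.default_rng(0)

# Hypothetical complex mixing matrix for one frequency bin (2 sources, 2 microphones).
a = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

# Assumed source powers in two time blocks; they differ between blocks.
d1 = np.diag([1.0, 0.2])
d2 = np.diag([0.3, 1.5])

# Covariance matrices of the mixed signals in each block.
r1 = a @ d1 @ a.conj().T
r2 = a @ d2 @ a.conj().T

w = joint_diagonalize_pair(r1, r2)

# Both covariance matrices become (numerically) diagonal under the same W.
ratios = [off_diagonal_ratio(w @ r @ w.conj().T) for r in (r1, r2)]

# W @ A is a scaled permutation matrix: each output carries one source,
# but its scale and its position are arbitrary.
global_system = w @ a

print("off-diagonal ratios:", ratios)
print("|W A| =\n", np.abs(global_system))
```

The product `global_system` shows the ambiguities mentioned in the abstract: each output contains one source, but with arbitrary scale and in arbitrary order. In a frequency-domain system this order can differ from bin to bin, which is the permutation problem. With more than two covariance matrices an exact solution generally does not exist, which is why approximate joint diagonalization is used.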