| Graduate student: | 郭致廷 Chih-Ting Kuo |
|---|---|
| Thesis title: | 可應用於VoIP封包遺失隱蔽及語音修改的低複雜度架構 (A Low-Complexity Structure for VoIP Packet-Loss Concealment and Speech Modification) |
| Advisor: | 王小川 Hsiao-Chuan Wang |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of publication: | 2008 |
| Academic year of graduation: | 96 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 67 |
| Chinese keywords: | 網路電話, 封包遺失隱蔽, 語音修改, 弦波模型 |
| Foreign-language keywords: | VoIP, Packet Loss Concealment, Speech Modification, Sinusoidal Model |
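VoIP is one of the voice communication technologies that has drawn considerable attention in recent years. Speech coding has by now matured to a large degree, but VoIP carries speech data over packet networks, whose transmission efficiency and reliability fall short of traditional telephone lines. How to recover an acceptable level of speech quality when packets are lost during a call has therefore become one of the key problems in VoIP.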
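The speech coding used in current VoIP systems is dominated by code-excited linear prediction (CELP), and most packet-loss concealment techniques rely on linear predictive coding to try to restore the periodic component of the speech signal. Because both CELP and current packet-loss concealment techniques have relatively high computational complexity, adding Internet transmission delay stretches the overall codec delay even further, which tends to reduce listener comfort.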
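To address these problems, this thesis segments the speech signal by pitch period and transmits speech information in units of one pitch period, using a low-complexity sinusoidal coding scheme to describe the speech waveform. This reduces the computational complexity of encoding and decoding. For packet-loss concealment, simply repeating packets already achieves quite good recovery. The amount of data transmitted can also be adjusted actively and flexibly according to whether the network is congested, which lowers the chance of making congestion worse and improves the packet-loss rate.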
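The repetition-based concealment described above can be illustrated with a minimal sketch. It assumes that each packet carries the samples of exactly one pitch period and that a lost packet is signalled as `None`; the function name `conceal_by_repetition` and the packet representation are hypothetical and are not taken from the thesis, which transmits sinusoidal parameters rather than raw samples.

```python
from typing import List, Optional, Sequence

# Assumption for illustration: one packet = the samples of one pitch period.
Packet = List[float]


def conceal_by_repetition(received: Sequence[Optional[Packet]]) -> List[Packet]:
    """Replace each lost packet (None) with a copy of the last good packet.

    If a loss occurs before any packet has arrived, there is nothing to
    repeat, so an empty packet is emitted for that slot.
    """
    output: List[Packet] = []
    last_good: Optional[Packet] = None
    for packet in received:
        if packet is not None:
            last_good = packet
            output.append(list(packet))
        elif last_good is not None:
            # Lost packet: repeat the previous pitch period.
            output.append(list(last_good))
        else:
            # Lost packet with no history yet.
            output.append([])
    return output


if __name__ == "__main__":
    stream = [[0.0, 0.5, 1.0], None, None, [0.2, 0.4]]
    print(conceal_by_repetition(stream))
```

Because every packet spans a whole pitch period in this sketch, a repeated packet continues the waveform at a period boundary, which is one plausible reason simple repetition can work well in such a scheme.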
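In addition, when combined with linear predictive coding, the proposed structure can also be applied to speech modification, such as slowing down or speeding up speech and adjusting its pitch and tone, so that these modifications can likewise be carried out within a simple, low-complexity structure.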
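As a rough illustration of how pitch-period units lend themselves to speed changes, the sketch below slows down or speeds up a signal by repeating or dropping whole pitch periods. This is a generic simplification, not the thesis's method: the function `change_speed`, its `rate` parameter, and the nearest-period selection rule are assumptions made here, and the sketch omits the linear predictive coding stage mentioned in the abstract.

```python
from typing import List, Sequence

# Assumption for illustration: the signal is already split into pitch periods.
Period = List[float]


def change_speed(periods: Sequence[Period], rate: float) -> List[Period]:
    """Resample a sequence of pitch periods in time.

    rate > 1 speeds speech up by dropping periods; rate < 1 slows it down
    by repeating periods. The samples inside each period are untouched, so
    the pitch of each period is preserved.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not periods:
        return []
    n_out = max(1, round(len(periods) / rate))
    last = len(periods) - 1
    # Output slot i takes the input period at position i * rate, clamped.
    return [list(periods[min(last, int(i * rate))]) for i in range(n_out)]


if __name__ == "__main__":
    periods = [[1.0], [2.0], [3.0], [4.0]]
    print(change_speed(periods, 2.0))  # twice as fast
    print(change_speed(periods, 0.5))  # half speed
```

Changing the pitch itself would require altering the length of each period rather than the number of periods, which this sketch does not attempt.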