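Graduate student: | 李念容 Nien-Jung Lee |
---|---|
Thesis title: | 哼唱檢索的辨識方法改進及探討 A Study on Improving the Methods for Querying by Singing/Humming |
Advisor: | 張智星 Jyh-Shing Roger Jang |
Oral defense committee: | |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Year of publication: | 2007 |
Academic year of graduation: | 95 |
Language: | Chinese |
Number of pages: | 36 |
Chinese keywords: | 旋律辨識 (melody recognition) |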
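Query by singing/humming (QBSH) retrieves the correct song from a large database given a sung or hummed query. Dynamic time warping (DTW), the method used in earlier work, is slow to compute but returns more useful results. The alternative, linear scaling (LS), is very fast but tolerates errors in the hummed query less well than DTW: both inaccurate pitch and unstable note durations degrade its results noticeably, so it performs well only when the query is hummed accurately.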
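The trade-off between the two methods can be illustrated with a minimal sketch. It assumes pitch sequences given as lists of semitone values sampled at a fixed frame rate, a mean shift to compensate for key differences, and a small hypothetical set of scaling factors. None of these choices come from the thesis; the DTW shown is the textbook formulation with both endpoints anchored, not necessarily the variant the author used.

```python
def resample(seq, n):
    """Linearly interpolate seq to exactly n points."""
    if n == 1:
        return [float(seq[0])]
    out = []
    for i in range(n):
        pos = i * (len(seq) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(seq) - 1)
        frac = pos - lo
        out.append(seq[lo] * (1 - frac) + seq[hi] * frac)
    return out


def mean_shift(seq):
    """Subtract the mean pitch (assumed here as a simple key normalization)."""
    m = sum(seq) / len(seq)
    return [x - m for x in seq]


def ls_distance(query, ref, factors=(0.5, 0.75, 1.0, 1.25, 1.5)):
    """Linear scaling: stretch the whole query uniformly by each factor and
    compare it with the beginning of the reference. Cost is linear in the
    sequence length for each factor."""
    best = float("inf")
    for f in factors:
        n = round(len(query) * f)
        if n < 2 or n > len(ref):
            continue  # this scaling does not fit the reference
        q = mean_shift(resample(query, n))
        r = mean_shift(ref[:n])
        d = sum(abs(a - b) for a, b in zip(q, r)) / n
        best = min(best, d)
    return best


def dtw_distance(query, ref):
    """Textbook DTW: frames may be stretched non-uniformly, at a cost
    proportional to len(query) * len(ref)."""
    q, r = mean_shift(query), mean_shift(ref)
    n, m = len(q), len(r)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(q[i - 1] - r[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


# A query with unstable note durations: the first note is cut short
# and the second is held too long.
reference = [60, 60, 62, 62, 64, 64]
uneven_query = [60, 62, 62, 62, 62, 64]
print(ls_distance(uneven_query, reference))   # nonzero: no uniform stretch fits
print(dtw_distance(uneven_query, reference))  # zero: DTW absorbs the uneven timing
```

In this toy case DTW aligns the query perfectly while LS cannot, which mirrors the tolerance difference described above; LS pays for its speed by allowing only one global tempo change.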
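To improve the recognition performance of LS, this thesis proposes two variants of LS that aim to handle users humming at an uneven tempo: segmented linear scaling (SLS) and note-based linear scaling (NBLS). SLS cuts the melody into several segments and applies LS to each segment in sequence. NBLS performs LS using note duration as the unit of scaling, and from it we derive two further methods, NBLS1 and NBLS2.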
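As a rough illustration of the segmented idea, the sketch below cuts the query into equal-length segments and matches each one by LS, starting where the previous segment's best match ended. The greedy per-segment choice, the equal-length split, the scaling factors, and the function names are all assumptions made for this example; the abstract does not specify how the thesis segments the melody or selects factors. Both sequences are assumed to be in the same key already.

```python
def resample(seq, n):
    """Linearly interpolate seq to exactly n points."""
    if n == 1:
        return [float(seq[0])]
    out = []
    for i in range(n):
        pos = i * (len(seq) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(seq) - 1)
        frac = pos - lo
        out.append(seq[lo] * (1 - frac) + seq[hi] * frac)
    return out


def best_ls_match(segment, ref, start, factors):
    """Try each scaling factor for one segment against ref[start:].
    Returns (distance, end position in ref) of the best fit."""
    best = (float("inf"), start)
    for f in factors:
        n = round(len(segment) * f)
        if n < 1 or start + n > len(ref):
            continue  # scaled segment would run past the reference
        stretched = resample(segment, n)
        window = ref[start:start + n]
        d = sum(abs(a - b) for a, b in zip(stretched, window)) / n
        if d < best[0]:
            best = (d, start + n)
    return best


def sls_distance(query, ref, num_segments=3, factors=(0.5, 1.0, 1.5, 2.0)):
    """Segmented linear scaling (hypothetical greedy version): each segment
    gets its own scaling factor, so tempo may change between segments."""
    seg_len = len(query) // num_segments
    total, pos = 0.0, 0
    for k in range(num_segments):
        lo = k * seg_len
        hi = len(query) if k == num_segments - 1 else lo + seg_len
        d, pos = best_ls_match(query[lo:hi], ref, pos, factors)
        if d == float("inf"):
            return d  # no factor fits this segment
        total += d
    return total / num_segments


# The reference has one frame per note; the query sings the first four notes
# at the reference tempo and the last four at half speed.
reference = [60, 62, 64, 65, 67, 69, 71, 72]
query = [60, 62, 64, 65, 67, 67, 69, 69, 71, 71, 72, 72]
print(sls_distance(query, reference, num_segments=3))  # segments scale independently
print(sls_distance(query, reference, num_segments=1))  # one global factor, as in plain LS
```

With `num_segments=1` the function reduces to plain LS with a single global factor, which makes the effect of segmentation easy to compare on the same input.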
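In the experiments, we examine the recognition performance of each method and discuss their strengths and weaknesses. We also combine DTW with each of the other methods, hoping to exceed the recognition rate of any single method by drawing on the strengths of both, and we examine whether some relationship exists between the two combined methods in order to find the best combination. In addition, we run the experiments on two kinds of test data, manually labeled pitch and pitch produced by a pitch tracker, to observe how pitch tracking affects the recognition rate.
The experimental results show that the proposed NBLS2 does remedy the weakness of LS and effectively solves the problem of recognition failures when the humming tempo is uneven. Although NBLS2 does not recognize as well as DTW, its computation time is only 0.2 times that of DTW, so it gives up a small amount of recognition rate in exchange for a large gain in efficiency.
Finally, we propose improvements based on the results of the error analysis and conclude the thesis.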
Dynamic time warping (DTW) is a very effective method for query by singing/humming (QBSH), but it requires a lot of computation. Linear scaling (LS), on the other hand, requires much less computation time, but it is not as effective as DTW. The goal of this thesis is therefore to find new methods that combine the advantages of DTW and LS for efficient and effective music retrieval in QBSH systems.
Specifically, we propose two methods in this thesis: segmented linear scaling (SLS) and note-based linear scaling (NBLS). We have performed extensive experiments to demonstrate that the proposed methods can indeed combine the effectiveness of DTW and the efficiency of LS to construct a more practical QBSH system. Conclusions and future work are also addressed in the thesis.