Improving Linear Scaling for Query-by-Singing/Humming | National Tsing Hua University Master's and Doctoral Theses Repository

Graduate student:	饒彥章 Jao, Yen-Chang
Thesis title:	改進線性伸縮以用於哼唱選歌 Improving Linear Scaling for Query-by-Singing/Humming
Advisors:	張智星 Jang, Jyh-Shing; 張俊盛 Chang, Jason S.
Oral defense committee:	呂仁園 Renyuan Lyu; 徐嘉連 Jia-Lien Hsu
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication:	2014
Academic year of graduation:	102 (ROC calendar)
Language:	Chinese
Number of pages:	45
Keywords (Chinese):	音樂檢索、哼唱選歌、線性伸縮、黃金比例搜尋法、序列誤差向量
Keywords (English):	music retrieval, query-by-singing/humming, linear scaling, golden section search, sorted error vector

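In this thesis, we propose an integrated framework that effectively improves query by singing/humming (QBSH). It comprises three improvements. The first uses golden section search to reduce the matching time of conventional linear scaling. The second assigns different weights to the rests in the pitch vectors (of both the user's humming and the database songs) to reduce the influence of rests on the distance computation. The third applies the concept of a sorted error vector when comparing pitch vectors: it ignores the portion of the distance values that deviate too much and uses the remaining values as the matching distance. This reduces the effect of brief pitch deviations caused by the user's limited singing skill or by pitch-tracking errors.
The integrated scheme not only shortens the time needed for recognition (method 1) but also raises recognition accuracy (methods 2 and 3). In our experiments on the MIR-QBSH database and test corpus, we obtained an error reduction rate of 21.4% and reduced the matching time by 49.3%.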

This thesis proposes a framework that improves both the efficiency and the effectiveness of a query-by-singing/humming (QBSH) system. The framework is based on three methods. Method 1 uses golden section search to reduce the computation time of the traditional linear scaling (LS) algorithm. Method 2 assigns different weights to rests (in both database songs and queries) so that rests have less effect on the weighted distance. Method 3 uses a sorted error vector to ignore the LS distances that are overly large and considers only the remaining LS distances in the computation. This reduces the effect of short-lived pitch deviations, which are probably caused by the singer being out of tune or by pitch-tracking errors.
The proposed framework improves on the baseline LS-based QBSH system in both computation time (via method 1) and recognition accuracy (via methods 2 and 3). Our experiments on the MIR-QBSH dataset show an error reduction rate of 21.4% and a 49.3% decrease in computation time.
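The three methods in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation, and every name and setting in it is an assumption made for illustration: the function names (`resample`, `fill_rests`, `distance`, `ls_distance`, `gss_linear_scaling`), the rest marker `0`, nearest-neighbour resampling, filling rests with the previous voiced pitch, the median key shift, the scaling range 0.5 to 2.0, and the parameters `rest_weight`, `keep` and `tol`. Plain golden section search also assumes the distance is unimodal in the scaling ratio, which may not hold for real queries.

```python
import math
from statistics import median

INV_PHI = (math.sqrt(5) - 1) / 2  # 1 / golden ratio, about 0.618
REST = 0  # assumed rest marker; other values are pitches in semitones


def resample(pitch, n):
    """Nearest-neighbour stretch/compress of a pitch vector to n frames (n >= 2)."""
    scale = (len(pitch) - 1) / (n - 1)
    return [pitch[round(i * scale)] for i in range(n)]


def fill_rests(pitch):
    """Replace each rest with the previous voiced pitch; also return a rest mask."""
    last = next((p for p in pitch if p != REST), 0.0)
    filled = []
    for p in pitch:
        if p != REST:
            last = p
        filled.append(last)
    return filled, [p == REST for p in pitch]


def distance(query, song, rest_weight=0.5, keep=0.9):
    """Weighted, trimmed mean absolute pitch error over the overlapping frames."""
    n = min(len(query), len(song))
    q, q_rest = fill_rests(query[:n])
    s, s_rest = fill_rests(song[:n])
    # Assumed key transposition: shift the query by the median pitch difference.
    shift = median(si - qi for si, qi in zip(s, q))
    # Method 2 (sketch): frames that were rests get a smaller weight.
    errors = sorted(
        (abs(qi + shift - si), rest_weight if (qr or sr) else 1.0)
        for qi, si, qr, sr in zip(q, s, q_rest, s_rest)
    )
    # Method 3 (sketch): sorted error vector, drop the largest errors.
    kept = errors[: max(1, int(keep * n))]
    total_weight = sum(w for _, w in kept)
    if total_weight == 0:
        return 0.0
    return sum(e * w for e, w in kept) / total_weight


def ls_distance(query, song, ratio, **kw):
    """Linear scaling: stretch the query by `ratio`, then compare with the song."""
    n = max(2, round(ratio * len(query)))
    return distance(resample(query, n), song, **kw)


def gss_linear_scaling(query, song, lo=0.5, hi=2.0, tol=0.01, **kw):
    """Method 1 (sketch): golden section search over the scaling ratio."""
    a, b = lo, hi
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc = ls_distance(query, song, c, **kw)
    fd = ls_distance(query, song, d, **kw)
    while b - a > tol:
        if fc < fd:  # minimum lies in [a, d]; reuse c as the new d
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = ls_distance(query, song, c, **kw)
        else:  # minimum lies in [c, b]; reuse d as the new c
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = ls_distance(query, song, d, **kw)
    return (c, fc) if fc < fd else (d, fd)
```

Calling `gss_linear_scaling(query, song)` returns a `(ratio, distance)` pair. Each iteration reuses one of the two previous evaluations, so only one new distance is computed per step, whereas an exhaustive linear scaling search evaluates every candidate ratio. Setting `rest_weight=1.0` and `keep=1.0` turns off the rest weighting and the error trimming in this sketch.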

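Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1    Introduction
1.1    Research Topic
1.2    Overview of Related Work
1.3    Research Direction and Results of This Thesis
1.4    Chapter Overview
Chapter 2    Related Theory and Background
2.1    Linear Scaling
2.2    Golden Ratio
2.3    Golden Section Search
Chapter 3    Methods
3.1    Accelerating Linear Scaling with Golden Section Search
3.1.1    GSS over LS
3.1.2    Problems with GSS over LS
3.1.3    GSS Hybrid over LS
3.2    Weighted Distance Computation
3.2.2    Changing the Weight of Rests
3.3    Sorted Error Vector
3.4    Integration of the Methods
Chapter 4    Experimental Results and Analysis
4.1    Experimental Setup
4.2    Test Corpus and Database
4.3    Recognition Rate and Recognition Time of LsGssHybrid with Different Step Sizes
4.4    Speedup Analysis of LsGss and LsGssHybrid
4.5    Recognition Rate with Different Rest Weights
4.6    Recognition Rate with Different SEV Bounds
4.7    Recognition Rate and Speedup of the Combined Methods
Chapter 5    Conclusions and Future Research Directions
5.1    Conclusions
5.2    Future Work
References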

                                

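Full-text release date: full text not authorized for public release (campus network)
Full-text release date: full text not authorized for public release (off-campus network)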
