
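Author: 王崇喆 (Wang, Chung-Che)
Thesis title: 同時使用旋律與歌詞資訊之改良型哼唱檢索系統 (An Improved Query by Singing/Humming System Using Melody and Lyrics Information)
Advisor: 張智星 (Jang, Jyh-Shing Roger)
Oral defense committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science
Year of publication: 2010
Academic year of graduation: 98 (ROC calendar)
Language: English
Number of pages: 31
Keywords (translated from Chinese): query by singing/humming, singing/humming discrimination, lyrics recognition, phone similarity, syllable similarity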
    This thesis proposes an improved query by singing/humming (QBSH) system that uses both melody and lyrics information to achieve better performance. Singing/humming discrimination (SHD) is performed first to separate singing queries from humming queries. For a humming query, we apply a pitch-only melody recognition method that ranked first in the QBSH task at MIREX. For a singing query, we combine the scores from melody recognition and lyrics recognition to take advantage of the extra lyrics information. Lyrics recognition is based on a modified tree lexicon, a structure commonly used in speech recognition. For top-20 recognition, the overall QBSH system achieves error reduction rates of 39.01% and 23.53% under two experimental settings, indicating the feasibility of the proposed method.
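    The routing described in the abstract (melody only for humming, melody plus lyrics for singing) can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the abstract does not say how the two scores are combined, so the min-max normalization, the weighted sum, the `lyrics_weight` parameter, and all function names here are assumptions. The `error_reduction_rate` helper uses the common definition (relative drop in error rate), which is also assumed rather than taken from the record.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, return all zeros."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {song: 0.0 for song in scores}
    return {song: (s - lo) / (hi - lo) for song, s in scores.items()}


def rank_songs(
    melody_scores: Dict[str, float],
    lyrics_scores: Dict[str, float],
    is_singing: bool,
    lyrics_weight: float = 0.5,  # hypothetical weight, not from the thesis
) -> List[str]:
    """Return song IDs sorted from best to worst match (higher score = better)."""
    melody = min_max_normalize(melody_scores)
    if not is_singing:
        # Humming query: no lyrics are present, so rank by melody alone.
        combined = melody
    else:
        # Singing query: assumed weighted sum of normalized scores.
        lyrics = min_max_normalize(lyrics_scores)
        combined = {
            song: (1.0 - lyrics_weight) * melody[song]
            + lyrics_weight * lyrics.get(song, 0.0)
            for song in melody
        }
    return sorted(combined, key=combined.get, reverse=True)


def error_reduction_rate(baseline_error: float, improved_error: float) -> float:
    """Relative reduction in error rate (assumed definition)."""
    return (baseline_error - improved_error) / baseline_error
```

    In this sketch the output of the singing/humming discriminator is passed in as the `is_singing` flag, so a misclassified humming query would be scored against lyrics it does not contain; how the thesis handles that case is not stated in the abstract.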

    Chapter 1. Introduction
    Chapter 2. System Overview
      2.1 Phone and Syllable Similarity
      2.2 Singing/Humming Discrimination
      2.3 Lyrics Recognition
      2.4 Lyrics Scoring and Combination
    Chapter 3. Experiments
      3.1 The Dataset
      3.2 Experimental Results
        3.2.1 Results of SHD
        3.2.2 Lyrics Recognition and Combined Results
    Chapter 4. Conclusions and Future Work
    References
    Appendix A: The List of 156 Bi-phones
    Appendix B: The List of 423 Syllables

