An Effective Re-ranking Method Based on Learning to Rank for Improving Audio Fingerprinting

Graduate student:	林孟樺 Lin, Meng-Hua
Thesis title:	An Effective Re-ranking Method Based on Learning to Rank for Improving Audio Fingerprinting
Advisors:	張智星 Jang, Jyh-Shing 張俊盛 Chang, Jason S.
Committee members:	呂仁園 Ren-Yuan Lyu 徐嘉連 Jia-Lien Hsu
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications
Year of publication:	2014
Graduation academic year:	102 (ROC calendar)
Language:	Chinese
Number of pages:	47
Keywords:	music retrieval, audio fingerprinting, learning to rank, PRanking, Ranking SVM, ListNet

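Audio fingerprinting is a fast method of music retrieval: a song is recorded through a microphone, the recording is sent to a recognition system for processing, and the best-matching result is returned to the user. In real life, however, the user may be in a noisy environment such as a restaurant or a supermarket, so the original music is contaminated by noise and the recognition rate drops. This thesis improves the recognition rate for such noise-contaminated queries.
We build a two-stage recognition system. A re-ranking threshold condition is applied to the first-stage result, and queries that do not satisfy the condition are passed to a second stage of recognition. The second stage compares sound frequency and time, measuring the frequency and time similarity between the query clip and each song, and re-ranks the top ten songs from the first stage. Three kinds of learning-to-rank methods are used: pointwise, pairwise, and listwise. Experimental results show that the proposed improvements do raise the recognition rate.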
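The two-stage flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not say how the confidence of a first-stage result is measured, so the ratio of the best to the second-best score, the threshold value `1.5`, and the `second_stage_score` callable are all hypothetical stand-ins.

```python
from typing import Callable, List, Tuple

Candidate = Tuple[str, float]  # (song_id, first-stage score)


def two_stage_identify(
    first_stage: List[Candidate],
    second_stage_score: Callable[[str], float],
    confidence_threshold: float = 1.5,  # hypothetical value, not from the thesis
    top_n: int = 10,
) -> List[str]:
    """Return song ids ordered from best to worst match."""
    ranked = sorted(first_stage, key=lambda c: c[1], reverse=True)
    ids = [song_id for song_id, _ in ranked]
    if len(ranked) < 2:
        return ids

    # Hypothetical confidence measure: ratio of best to second-best score.
    best, runner_up = ranked[0][1], ranked[1][1]
    confidence = best / runner_up if runner_up > 0 else float("inf")
    if confidence >= confidence_threshold:
        return ids  # first-stage result is trusted as is

    # Low confidence: re-rank only the top-N candidates with the
    # second-stage similarity and keep the remaining order unchanged.
    head = sorted(ids[:top_n], key=second_stage_score, reverse=True)
    return head + ids[top_n:]


# Toy example: the first stage barely prefers "a", so the query is re-ranked.
stage1 = [("a", 10.0), ("b", 9.0), ("c", 2.0)]
similarity = {"a": 0.2, "b": 0.9, "c": 0.1}  # made-up second-stage scores
print(two_stage_identify(stage1, similarity.get))  # ['b', 'a', 'c']
```

Gating the second stage on a threshold means the more expensive comparison is only run for queries whose first-stage result looks uncertain.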

Audio fingerprinting (AFP) is a fast method of music retrieval. It first records a segment of music through the microphone of a cellphone or tablet and sends the recorded segment to a server for AFP computation. The server then returns the most likely song to the user. In real-life scenarios, however, users commonly record in noisy environments such as restaurants or supermarkets. The noise can distort the recording and thus degrade the accuracy of AFP. The goal of this research is to improve the accuracy of the system in noisy environments.
The recognition system has two stages. The first stage computes a confidence score for the query. A query with a low confidence score goes to the second stage for re-ranking. In the second stage, the query is compared in frequency and time with the top 10 songs obtained from the first stage, and those songs are re-ranked to improve recognition accuracy. Three learning-to-rank approaches are used to handle the ranking problem: pointwise, pairwise, and listwise. Experimental results show that the proposed re-ranking method improves the recognition rate.

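Table of Contents
Abstract (Chinese)
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1    Introduction
1    Research Background
2    Research Objectives
3    Related Work
4    Chapter Overview
Chapter 2    Audio Fingerprinting System
1    System Overview and Workflow
2    Landmark Extraction
2.1    Finding Peaks
2.2    Forming Landmarks
3    Converting Landmarks to Hash Keys and Hash Values
4    Identifying Song Clips
4.1    Retrieving Hash Values for Matching Hash Keys from the Hash Table
4.2    Offset Time
4.3    Statistical Analysis
Chapter 3    Methods and Implementation
1    Purpose of the Improvement
1.1    Concept of the Improvement
2    Match Peak Count I (MPCI)
2.1    Assumptions
2.2    Improved Method
3    Match Peak Count II (MPCII)
3.1    Assumptions
3.2    Improved Method
4    Differences Between MPCI and MPCII
5    Learning to Rank Algorithms
5.1    PRanking
5.2    Ranking SVM
5.3    ListNet
Chapter 4    Experimental Results and Discussion
1    Experimental Setup
2    Experimental Corpus
3    Two-Stage Recognition
3.1    Re-ranking Threshold
3.2    Re-ranking Top N
4    Match Peak Count I and Match Peak Count II
4.1    Purpose
4.2    Procedure
4.3    Results and Analysis
5    Learning to Rank
5.1    Purpose
5.2    Procedure
5.3    Results and Analysis
Chapter 5    Conclusion and Future Work
1    Conclusion
2    Future Work
References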

                                


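Full-text release date: full text not authorized for public release (campus network)
Full-text release date: full text not authorized for public release (off-campus network)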
