Author: | 林孟樺 Lin, Meng-Hua |
---|---|
Thesis title: | 使用排序學習演算法產生重新排名以改進的音訊指紋辨識 An Effective Re-ranking Method Based on Learning to Rank for Improving Audio Fingerprinting |
Advisors: | 張智星 Jang, Jyh-Shing; 張俊盛 Chang, Jason S. |
Oral defense committee: | 呂仁園 Ren-Yuan Lyu; 徐嘉連 Jia-Lien Hsu |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications |
Year of publication: | 2014 |
Academic year of graduation: | 102 (ROC calendar) |
Language: | Chinese |
Number of pages: | 47 |
Keywords: | music retrieval, audio fingerprinting, learning to rank, PRanking, Ranking SVM, ListNet |
Audio fingerprinting (AFP) is a fast method of music retrieval. A segment of music is recorded through the microphone of a cellphone or tablet and sent to a server for AFP computation, and the server returns the most likely song to the user. In real-life use, however, recordings are often made in noisy environments such as restaurants or supermarkets. The noise contaminates the recording and lowers recognition accuracy. This thesis aims to improve the recognition rate for noise-contaminated queries.
We build a two-stage recognition system. The first stage computes a confidence score for each query, and a threshold condition on the first-stage result decides whether re-ranking is needed; queries that do not meet the condition go on to the second stage. The second stage compares the query with each of the top 10 songs returned by the first stage in terms of frequency and time similarity, and re-ranks those 10 songs. Three learning-to-rank approaches are used for the re-ranking: pointwise, pairwise, and listwise. Experimental results show that the proposed re-ranking method improves the recognition rate.
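The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The confidence measure (relative margin between the top two first-stage scores), the threshold value, the feature vectors, and the fixed linear scorer are all hypothetical stand-ins chosen for illustration. In the thesis, the second-stage ranking model is learned with pointwise, pairwise, and listwise learning-to-rank methods, which this sketch does not reproduce.

```python
from typing import List, Sequence, Tuple

# (song_id, first-stage score, second-stage similarity features)
# The feature layout is an assumption made for this sketch.
Candidate = Tuple[str, float, Sequence[float]]


def confidence(stage1_scores: Sequence[float]) -> float:
    """Hypothetical confidence: relative margin between the two best scores."""
    ordered = sorted(stage1_scores, reverse=True)
    if len(ordered) < 2 or ordered[0] <= 0:
        return 0.0
    return (ordered[0] - ordered[1]) / ordered[0]


def linear_score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Stand-in for a learned ranking model: a plain weighted sum."""
    return sum(w * x for w, x in zip(weights, features))


def recognise(
    candidates: List[Candidate],
    weights: Sequence[float],
    threshold: float = 0.3,  # assumed value, not taken from the thesis
    top_k: int = 10,
) -> List[str]:
    """Return song ids, re-ranking the top_k only for low-confidence queries."""
    # Stage 1: order candidates by their first-stage score.
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    if confidence([c[1] for c in ranked]) >= threshold:
        return [c[0] for c in ranked]  # confident enough, keep stage-1 order

    # Stage 2: re-rank only the top_k candidates by the second-stage score.
    head, tail = ranked[:top_k], ranked[top_k:]
    head = sorted(head, key=lambda c: linear_score(weights, c[2]), reverse=True)
    return [c[0] for c in head + tail]


if __name__ == "__main__":
    noisy_query = [
        ("a", 10.0, [0.2, 0.1]),
        ("b", 9.5, [0.9, 0.8]),
        ("c", 3.0, [0.1, 0.1]),
    ]
    print(recognise(noisy_query, weights=[1.0, 1.0]))  # ['b', 'a', 'c']
```

In the usage example, the top two first-stage scores are close, so the query falls below the assumed threshold and the second stage promotes song `b`, whose similarity features are stronger. A query with a clear first-stage winner would skip the second stage and keep its original order.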