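Graduate student: 蔡承育 (Cai, Cheng-Yu)
Thesis title: 基於非負值矩陣分解的雙聲道鼓聲分離技術及其應用 (A NMF-based Dual-channel Drum Separation Technique and Its Application)
Advisors: 蘇黎 (Su, Li); 蘇郁惠 (Su, Yu-Huei)
Oral defense committee: 姚書農 (Yao, Shu-Nung); 彭冠舉 (Peng, Guan-Ju)
Degree: Master
Department: College of Arts - Department of Music
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar)
Language: Chinese
Number of pages: 69
Keywords (Chinese): 非負矩陣 (non-negative matrix), 聲源分離 (source separation), 鼓聲分離 (drum separation), 自動混音 (automix)
Keywords (English): Non-negative Matrix Factorization (NMF), Source Separation, Drum Separation, Automix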
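    This study investigates how well non-negative matrix factorization (NMF) separates the sources in a dual-channel drum-kit signal, and further applies the separation results to automatic mixing (automix). The thesis is the first to propose forming a dual-channel drum-kit signal from two overhead microphone signals and separating eight drum-kit instruments from it.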

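    The results of the drum source separation experiments are as follows. Score-informed dual-channel NMF is the best separation method. Taken overall, dual-channel source separation performs better and generalizes more widely than single-channel source separation and industry source separation tools. Compared with the Kullback-Leibler (KL) divergence and the Itakura-Saito (IS) divergence, the source separation system performs best with the Euclidean distance as its cost function.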
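    For reference, the three cost functions compared here are commonly written as below for a non-negative spectrogram V approximated by the product WH. The symbols V, W, H, the indices f (frequency) and t (time), and the 1/2 scaling of the Euclidean term are conventional notation assumed for illustration; the record itself does not give the formulas.

```latex
\begin{aligned}
D_{\mathrm{EUC}}(V \,\|\, WH) &= \tfrac{1}{2}\sum_{f,t}\bigl(V_{ft} - (WH)_{ft}\bigr)^{2},\\
D_{\mathrm{KL}}(V \,\|\, WH) &= \sum_{f,t}\Bigl(V_{ft}\log\frac{V_{ft}}{(WH)_{ft}} - V_{ft} + (WH)_{ft}\Bigr),\\
D_{\mathrm{IS}}(V \,\|\, WH) &= \sum_{f,t}\Bigl(\frac{V_{ft}}{(WH)_{ft}} - \log\frac{V_{ft}}{(WH)_{ft}} - 1\Bigr).
\end{aligned}
```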
    The results of the automix experiment are as follows. The mix after automix is closer to the human mix than the mix before automix.

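    Suggestions for the drum source separation experiments are as follows. For NMF to separate more precisely, adding dictionaries recorded with microphones at more varied positions is suggested. To improve the performance of the drum source separation system, more advanced methods, such as neural networks, are suggested for the drum source separation problem.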
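    Suggestions for the automix experiment are as follows. The automix system should gain mixing methods beyond volume, such as equalizer, reverb, and pan. It should also take into account that the human ear's sensitivity differs across frequency bands.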


    The purpose of this study is to investigate the effect of a Nonnegative Matrix Factorization (NMF) based dual-channel drum separation system and to apply the drum separation results to an automix system. This study proposes combining two overhead microphone signals into a dual-channel drum signal and separating eight drum-kit instruments from it.

    The results for the source separation system are summarized as follows. The best separation method is score-informed dual-channel NMF. Considering all factors, the proposed dual-channel source separation system is more general-purpose than, and outperforms, both the single-channel source separation system and commercial DAW source separation plugins. Compared with the Kullback-Leibler (KL) divergence and the Itakura-Saito (IS) divergence, the Euclidean distance is the best cost function for the source separation system.
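    A minimal sketch of how a fixed-dictionary NMF separation with the three cost functions could look, using the standard multiplicative update for the beta-divergence family (beta = 2 Euclidean, 1 KL, 0 IS). Several things here are assumptions made for illustration and are not confirmed by this record: the two channels are combined by stacking their spectrograms along the frequency axis, score information enters as a binary mask on the activations, and all function and variable names are hypothetical. The data are random toy values, not drum recordings.

```python
import numpy as np


def estimate_activations(V, W, beta=2.0, n_iter=300, score_mask=None,
                         eps=1e-12, seed=0):
    """Estimate H >= 0 with V ~= W @ H while the dictionary W stays fixed.

    V: (F, T) non-negative magnitude spectrogram.
    W: (F, K) non-negative dictionary, one column per drum template.
    beta: 2.0 = Euclidean distance, 1.0 = KL divergence, 0.0 = IS divergence.
    score_mask: optional (K, T) array of 0/1 (assumed form of score
        information); 0 marks frames where template k is silent.
        Multiplicative updates keep those entries at exactly zero.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    if score_mask is not None:
        H = H * score_mask
    for _ in range(n_iter):
        WH = W @ H + eps
        numerator = W.T @ (V * WH ** (beta - 2.0))
        denominator = W.T @ (WH ** (beta - 1.0)) + eps
        H = H * numerator / denominator
    return H


def split_sources(V, W, H, eps=1e-12):
    """Soft-mask V into one spectrogram per template; the parts sum back to V."""
    WH = W @ H + eps
    return [V * np.outer(W[:, k], H[k]) / WH for k in range(W.shape[1])]


def stack_channels(left, right):
    """Assumed dual-channel layout: stack the two channels along frequency."""
    return np.vstack([left, right])


# Toy data standing in for two overhead-microphone spectrograms.
rng = np.random.default_rng(1)
n_bins, n_frames, n_templates = 16, 40, 3
W_left = rng.random((n_bins, n_templates)) + 0.1
W_right = rng.random((n_bins, n_templates)) + 0.1
H_true = rng.random((n_templates, n_frames)) + 0.1
W = stack_channels(W_left, W_right)
V = stack_channels(W_left @ H_true, W_right @ H_true)

for beta, name in [(2.0, "Euclidean"), (1.0, "KL"), (0.0, "IS")]:
    H = estimate_activations(V, W, beta=beta)
    error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
    print(f"{name}: relative reconstruction error {error:.4f}")
```

    Stacking lets both channels share one activation matrix H, so a template is only active when both microphones support it. The toy errors printed above say nothing about which cost function works best on real drums.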
    The results for the automix system are summarized as follows. The mix after automix is better than the mix without automix; that is, it is closer to the mix produced by a human.

    Suggestions for the source separation system are summarized as follows. To separate drum sounds more accurately, a larger set of NMF dictionary atoms, for example from microphones placed at different positions, is suggested. To improve drum separation performance, more advanced methods such as deep neural networks are suggested.
    Suggestions for the automix system are summarized as follows. Besides mixing by volume, the automix system could use further mixing methods such as equalizer, reverb, and pan. The design should also account for the human ear's differing sensitivity across frequency bands.
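    To make the volume-only case concrete, here is a small sketch of gain-based mixing. The record does not say how the thesis chooses its gains, so the least-squares fit to a reference mix below is purely an assumption for illustration, as are the function names `fit_track_gains` and `mix`. The tracks are random noise, not drum stems.

```python
import numpy as np


def fit_track_gains(stems, reference_mix):
    """Least-squares gains g such that g @ stems best matches reference_mix.

    stems: (n_tracks, n_samples) separated tracks.
    reference_mix: (n_samples,) target mix, e.g. a human engineer's mix.
    """
    gains, *_ = np.linalg.lstsq(stems.T, reference_mix, rcond=None)
    return gains


def mix(stems, gains):
    """Gain-only mix: a weighted sum of the tracks."""
    return gains @ stems


# Toy example: three noise "tracks" and a reference mixed with known gains.
rng = np.random.default_rng(0)
stems = rng.standard_normal((3, 2000))
reference = mix(stems, np.array([0.5, 1.2, 0.8]))

unit_mix = mix(stems, np.ones(3))  # before automix: every gain is 1
auto_mix = mix(stems, fit_track_gains(stems, reference))

print("distance before:", np.linalg.norm(reference - unit_mix))
print("distance after: ", np.linalg.norm(reference - auto_mix))
```

    A single gain per track cannot reproduce equalizer, reverb, or pan decisions, which is the limitation the suggestions above point at.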

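    Chapter 1. Introduction
      Section 1. Research Motivation
      Section 2. Research Questions
    Chapter 2. Literature Review
      Section 1. Drum Kit
        1. Bass Drum
        2. Snare Drum
        3. Hi-hat
        4. Rack Tom
        5. Floor Tom
        6. Cymbals
      Section 2. Non-negative Matrix Factorization
        1. Basic Concepts of Non-negative Matrix Factorization
        2. NMFD
      Section 3. Non-negative Matrix Factorization Applied to Drum Separation
      Section 4. Automix
      Section 5. Industry Tools
        1. Industry Tools for Drum Source Separation
        2. Industry Tools for Automix
    Chapter 3. Experimental Procedure
      Section 1. Data
        1. Training Data
        2. Testing Data
        3. Evaluation data
        4. Imitation data
      Section 2. Extracting Dictionary W
      Section 3. Dual-channel NMF
      Section 4. Single channel NMF
      Section 5. Score-informed dual-channel NMF
      Section 6. Automix
    Chapter 4. Experimental Results
      Section 1. Basic Drum Separation Results
        1. Dual-channel NMF Results
        2. Single Channel NMF Results
        3. Score-informed dual-channel NMF Results
      Section 2. Advanced Drum Separation Results
        1. Dual-channel NMF Results
        2. Single Channel NMF Results
        3. Score-informed dual-channel NMF Results
        4. Results for Industry Source Separation Tools
      Section 3. Automix Results
    Chapter 5. Conclusions and Suggestions
      Section 1. Conclusions
        1. Conclusions on the NMF Dual-channel Drum Separation System
        2. Conclusions on the Automix System
      Section 2. Suggestions
        1. Suggestions for the NMF Dual-channel Drum Separation System
        2. Suggestions for the Automix System
        3. Suggestions for Humanities and Social Science Students Learning Computer Science Across Disciplines
    References
      English References
      Chinese References
    Appendix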

