| Graduate Student: | 洪子晴 Hung, Tzu-Ching |
|---|---|
| Thesis Title: | 以古典音樂術語為指引生成的鋼琴演奏詮釋 Piano Performance Rendering Guided by Classical Expressive Markings |
| Advisor: | 劉奕汶 Liu, Yi-Wen |
| Committee Members: | 蘇黎 Su, Li; 黃元豪 Huang, Yuan-Hao; 程瓊瑩 Cheng, Chiung-ying |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Electrical Engineering |
| Year of Publication: | 2024 |
| Graduation Academic Year: | 113 |
| Language: | English |
| Number of Pages: | 53 |
| Keywords (Chinese): | 富有表情的演奏詮釋、鋼琴資料集、古典音樂、音樂術語、音樂資訊檢索 |
| Keywords (English): | Expressive performance rendering, Piano dataset, Classical music, Expressive markings, Music Information Retrieval |
| Access Statistics: | Views: 50; Downloads: 0 |
In classical music, expressive and musical performances are always moving. Through nuanced emotional interpretation, performers convey the musical ideas and imagery the composer intended, while also reflecting their own understanding of the work and their state of mind at the moment of performance. Expressive markings on the score play an important role in this process, acting like a code that subtly links the emotions and thoughts of the composer and the performer. However, existing piano performance datasets contain relatively little data collected specifically around expressive markings, particularly expression terms, which limits related research in music information retrieval. This thesis aims to fill this gap by building a dataset of classical piano performances guided by expressive markings. We recruited 33 pianists and asked each of them to perform every piece under different expressive markings, covering both dynamic markings and expression markings; for some pieces, additional performances guided by emotions were also recorded. All performances were recorded and stored in MIDI format. To validate the applicability of the dataset, we designed a performance rendering system based on a Long Short-Term Memory (LSTM) model whose complexity suits the size of our dataset. In addition to predicting the velocity, duration, and inter-onset interval of each note according to the given expressive marking, the system effectively learns subtle differences in musical expression and thereby renders expressive music. We then had the system generate expressive performances for both learned and unseen pieces and evaluated them through a two-part listening test. The experimental results and listener feedback show that the rendering system successfully learned the characteristics of different expressive markings and can render music consistent with them. Moreover, we observed that the interpretation of an expressive marking is related not only to its own characteristics but also to the form and stylistic period of the musical work. In summary, the collected dataset of classical piano performances with expressive markings provides useful information about the characteristics of different markings and shows potential value for future research on expressive music performance and for other applications in music information retrieval.
Expressive performance in classical music plays a crucial role in shaping interpretations of musical works, allowing musicians to convey nuanced emotional and interpretative variations. However, existing datasets pay limited attention to expressive markings in piano performances, hindering the quantitative study of these elements. This thesis addresses this gap by introducing the Expressive Markings and Emotions 33 (EME33) dataset, which captures expressive piano performances in MIDI format, annotated with dynamic markings, expression markings, and additional emotional expressions. The dataset features performances by 33 pianists, each contributing multiple tracks per piece, with each track guided by a distinct interpretation. To validate the dataset's applicability, we employ a Long Short-Term Memory (LSTM) model whose complexity and generalization ability suit the size of our dataset. The model predicts key expressive features, such as velocity, duration, and inter-onset intervals, to render expressiveness in music. A two-part listening test, consisting of familiar and unseen musical excerpts, was conducted to evaluate the model's ability to represent expressiveness in both known and novel contexts. The results confirmed the model's effectiveness in capturing and reproducing expressive performance, and subjective evaluations further verified its ability to reflect the intended expressive elements. Analysis of these evaluations provides insights into the characteristics of expressive markings and their relationship to musical works. In summary, the EME33 dataset demonstrates its potential as a valuable resource for research on expressive performance and opens new avenues for future studies in music information retrieval.
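To make the described rendering setup concrete, below is a minimal sketch, not the thesis implementation, of an LSTM that maps per-note score features plus an expressive-marking label to velocity, duration, and inter-onset interval. PyTorch, the class name `ExpressiveRenderer`, the feature dimensions, and the layer sizes are all illustrative assumptions.

```python
# A minimal sketch (not the thesis code) of an LSTM-based expressive
# performance renderer, assuming per-note score features and a categorical
# expressive-marking label as inputs; all sizes are illustrative.
import torch
import torch.nn as nn

class ExpressiveRenderer(nn.Module):
    def __init__(self, n_markings: int, note_feat_dim: int = 8,
                 marking_dim: int = 16, hidden_dim: int = 128):
        super().__init__()
        # Learned embedding for the expressive marking (e.g., "dolce", "forte").
        self.marking_emb = nn.Embedding(n_markings, marking_dim)
        self.lstm = nn.LSTM(note_feat_dim + marking_dim, hidden_dim,
                            num_layers=2, batch_first=True)
        # One regression output per expressive feature:
        # MIDI velocity, note duration, inter-onset interval.
        self.head = nn.Linear(hidden_dim, 3)

    def forward(self, note_feats: torch.Tensor, marking_id: torch.Tensor):
        # note_feats: (batch, n_notes, note_feat_dim) score-derived features
        # marking_id: (batch,) index of the guiding expressive marking
        m = self.marking_emb(marking_id)                       # (batch, marking_dim)
        m = m.unsqueeze(1).expand(-1, note_feats.size(1), -1)  # broadcast over notes
        h, _ = self.lstm(torch.cat([note_feats, m], dim=-1))
        return self.head(h)  # (batch, n_notes, 3): velocity, duration, IOI

# Usage example: render 64 notes under hypothetical marking indices.
model = ExpressiveRenderer(n_markings=12)
pred = model(torch.randn(2, 64, 8), torch.tensor([3, 7]))
print(pred.shape)  # torch.Size([2, 64, 3])
```

Conditioning the sequence model on a single marking embedding broadcast across all notes is one straightforward way to let a single network learn marking-dependent expressive tendencies from a modestly sized dataset.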