
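Author: 詹詩涵
Shih-Han Chan
Thesis title: 基於音高調節之歌聲合成系統
A Singing Voice Synthesis System Based on Pitch Curve Modulation
Advisor: 張智星
Jyh-Shing Roger Jang
Oral defense committee:
Degree: Master's (碩士)
Master
Department: College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications
Institute of Information Systems and Applications
Year of publication: 2006
Academic year of graduation: 94 (ROC calendar)
Language: Chinese
Number of pages: 42
Chinese keywords: 歌聲合成, 音高曲線, 基頻軌跡
English keywords: Singing Voice Synthesis, Pitch Curve, Pitch Contour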
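  • In this thesis, we improve the naturalness of synthesized singing by adjusting the pitch curve. The thesis focuses on how to generate pitch curves close to those of real singing, which then serve as the basis for synthesis, and implements two approaches: (1) predicting the pitch curve with a support vector machine (SVM); (2) using our proposed rule-based pitch prediction equations to model pitch curves under ten different conditions. In addition, we use a pitch-synchronous cross-fading method to resolve discontinuities at concatenation points, and add effects such as vibrato and reverberation to polish the synthesized singing. Finally, a listening test confirms that, compared with conventional singing voice synthesis methods, our rule-based pitch prediction equations make the synthesized voice sound more natural and pleasant.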


    In this study, we propose a singing voice synthesis system that improves the naturalness of the synthetic singing voice by modifying its pitch curves.
    Our goal is to produce a pitch curve similar to that of an actual singing voice. We employ two methods for pitch-curve prediction. In the first, we use a support vector machine (SVM) to train a regression model that predicts pitch curves. In the second, we propose a rule-based approach comprising 10 manually tuned equations for the pitch curves under different conditions.
    In the second half of the thesis, we discuss the signal processing techniques applied to modify pitch, duration, and volume. We further address poorly articulated pronunciation and discontinuity at syllable concatenation points by using a pitch-synchronous cross-fading approach. We also add euphonious effects such as vibrato and reverberation.
    Finally, we assess the proposed methods through pitch curve observation and a listening test. The results verify that the proposed rule-based approach makes the synthetic singing voice more natural than traditional singing voice synthesis approaches.
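    The abstract names cross-fading at concatenation points and vibrato as two of the processing steps but gives no formulas. The following is a minimal sketch of the general ideas only, not the thesis's implementation: the function names, the linear fade shape, and the default vibrato rate and depth are all assumptions chosen for illustration. The cross-fade here is a plain linear one; a pitch-synchronous variant would presumably also align the overlap region to pitch periods, which this sketch does not attempt.

```python
import math


def crossfade_concat(a, b, overlap):
    """Join two sample lists, linearly cross-fading the last `overlap`
    samples of `a` into the first `overlap` samples of `b`.
    (Hypothetical helper; not pitch-synchronous.)"""
    if overlap <= 0 or overlap > min(len(a), len(b)):
        raise ValueError("overlap must be in 1..min(len(a), len(b))")
    head = a[:len(a) - overlap]          # part of `a` left untouched
    tail = b[overlap:]                   # part of `b` left untouched
    mixed = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)      # weight of `b`, rises from near 0 to near 1
        mixed.append((1.0 - w) * a[len(a) - overlap + i] + w * b[i])
    return head + mixed + tail


def add_vibrato(pitch_hz, frame_rate, rate_hz=5.5, depth_cents=50.0):
    """Modulate a frame-wise pitch curve (in Hz) with a sinusoid.
    `rate_hz` and `depth_cents` are assumed illustrative defaults."""
    out = []
    for n, f0 in enumerate(pitch_hz):
        cents = depth_cents * math.sin(2.0 * math.pi * rate_hz * n / frame_rate)
        out.append(f0 * 2.0 ** (cents / 1200.0))  # 1200 cents per octave
    return out
```

    For example, `crossfade_concat(unit_a, unit_b, 64)` would blend 64 samples across the joint between two synthesis units, and `add_vibrato(curve, frame_rate=100)` would wobble a pitch curve sampled at 100 frames per second by at most half a semitone in each direction.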

    Chapter 1 Introduction
      1.1 Research Topic
      1.2 Research Goals and Methods
      1.3 Related Work
      1.4 Chapter Overview
    Chapter 2 Background on Singing Voice Synthesis
      2.1 Elements of Sound
      2.2 Characteristics of Singing and Speech
      2.3 Singing Voice Synthesis Methods
    Chapter 3 Singing Voice Synthesis System
      3.1 System Overview
      3.2 Pitch Curve Generation
        3.2.1 Pitch Curve Prediction: SVM
        3.2.2 Pitch Curve Prediction: Rule-based Equation
      3.3 Synthesis Unit Adjustment
        3.3.1 Pitch Adjustment
        3.3.2 Duration Adjustment
        3.3.3 Volume Adjustment
      3.4 Advanced Synthesis Processing
        3.4.1 Consonant Proportion Adjustment
        3.4.2 Synthesis Unit Concatenation
        3.4.3 Vibrato
        3.4.4 Reverberation
    Chapter 4 Experimental Results
      4.1 Pitch Curve Comparison
      4.2 Listening Test
    Chapter 5 Conclusions and Future Work
    References


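    Full-text release date: the full text is not authorized for public release (on-campus network)
    Full-text release date: the full text is not authorized for public release (off-campus network)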
