
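Graduate student: Chang, Li-Wei (張力偉)
Thesis title: Audio Super Resolution using Zero-Shot Neural Network (以zero shot為基礎之音訊超解析模型)
Advisor: Soo, Von-Wun (蘇豐文)
Committee members: Chiu, Ching-Te (邱瀞德); Shen, Chih-Ya (沈之涯)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2020
Academic year of graduation: 108 (ROC calendar)
Language: English
Number of pages: 22
Keywords (translated from Chinese): audio processing, deep learning, super resolution
Keywords (English): Audio signal, Super resolution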
  • In this thesis, we design a method for audio super-resolution that raises the sampling rate of a signal using a lightweight neural network model trained on real music albums. The model is inspired by the ideas and principles of zero-shot learning. It uses the low-resolution signal to reconstruct the high-frequency content quickly and successfully. In the experiments, we support 2x, 4x, and 6x up-sampling. In addition, a model trained on one track can be applied directly to other tracks recorded under similar conditions, for example the same singer on the same album.
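
The zero-shot idea mentioned in the abstract can be illustrated with a small sketch. In Zero-Shot Super-Resolution for images, training pairs are built from the test input itself by down-scaling it further; the Python below applies the same idea to a one-dimensional signal. It is a minimal illustration under stated assumptions, not the thesis's implementation: the function names (`decimate`, `linear_upsample`, `make_internal_pair`), the naive decimation without an anti-aliasing filter, and the linear-interpolation baseline are all hypothetical choices made for brevity.

```python
import math


def decimate(signal, factor):
    """Keep every `factor`-th sample (assumption: no anti-aliasing filter)."""
    return signal[::factor]


def linear_upsample(signal, factor):
    """Baseline up-sampling: linear interpolation between neighbouring samples."""
    out = []
    for i in range(len(signal) - 1):
        a, b = signal[i], signal[i + 1]
        for k in range(factor):
            out.append(a + (b - a) * k / factor)
    out.append(signal[-1])
    return out


def make_internal_pair(low_res, factor):
    """Build a (model_input, target) pair from the low-resolution track itself.

    The track is decimated once more, then interpolated back to its own
    rate. A model trained on such pairs learns to restore the detail that
    the extra decimation removed (hypothetical training setup).
    """
    coarse = decimate(low_res, factor)
    restored = linear_upsample(coarse, factor)
    n = min(len(restored), len(low_res))  # trim both to a common length
    return restored[:n], low_res[:n]


if __name__ == "__main__":
    # A short 440 Hz tone sampled at 8 kHz stands in for a low-resolution track.
    tone = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(17)]
    for factor in (2, 4, 6):  # the up-sampling factors named in the abstract
        model_input, target = make_internal_pair(tone, factor)
        print(factor, len(model_input), len(target))
```

A network would then be trained to map `model_input` to `target`, and afterwards applied to the original low-resolution track to predict content at `factor` times its sampling rate.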

    Abstract
    Introduction
    Backgrounds
        Audio Signal Sampling
        Sampling Rate
        Sampling Theorem
        Bit Depth
        Time Domain
        Frequency Domain
        Time-Frequency Domain
        Re-sample
            Down-sampling
            Up-sampling
            Problems In Up-sampling
        Related Work
            Audio Super Resolution with neural networks
            Zero-shot Super Resolution
    Method & Setup
        Problem Describe
        Flow
            Adapting to the preprocessing
        Internal Information
        Model
        Architecture
        Process
    Experiments & Results
        Data
        Measure Methods
        Settings
        Evaluation
        Extra
            Another dataset
            Result
            A generic model
            Method
            Result
    Discussion
        Advantage
        Limitations
        Other training method.
    Conclusion
    References

    [1] Volodymyr Kuleshov, S. Zayd Enam, and Stefano Ermon: Audio Super-Resolution Using Neural Nets

    [2] Teck Yian Lim, Raymond A. Yeh, Yijia Xu, Minh N. Do, and Mark Hasegawa-Johnson: Time-Frequency Networks for Audio Super-Resolution

    [3] Sung Kim and Visvesh Sathe: Adversarial Audio Super-Resolution with Unsupervised Feature Losses

    [4] Assaf Shocher, Nadav Cohen, and Michal Irani: "Zero-Shot" Super-Resolution using Deep Internal Learning
