| Graduate Student | 蔡立篁 Tsai, Li-Huang |
|---|---|
| Thesis Title | 元批標準化: 增進模型在跨領域資料分布之泛化能力 Meta BatchNorm: Enhancing Model Generalization for Cross-Domain Data Distributions |
| Advisor | 張世杰 Chang, Shih-Chieh |
| Committee Members | 陳縕儂 Chen, Yun-Nung; 何宗易 Ho, Tsung-Yi |
| Degree | 碩士 Master |
| Department | 電機資訊學院 資訊工程學系 (College of Electrical Engineering and Computer Science, Department of Computer Science) |
| Year of Publication | 2021 |
| Academic Year of Graduation | 109 |
| Language | English |
| Pages | 24 |
| Chinese Keywords | 批標準化、深度學習、跨領域分布 (Batch Normalization, Deep Learning, Cross-Domain Distribution) |
| English Keywords | Batchnorm, Deep Learning, Cross-Domain Distribution |
Though deep neural networks have achieved great success, the lack of cross-domain generalization caused by shifts between training and testing data, or by real-world input perturbations, is a critical problem. The performance of a DNN may drop when test examples are drawn from a distribution different from that of the training dataset. To resolve the cross-domain generalization problem, several previous works proposed data augmentation techniques that improve test-time performance by widening the training data distribution to narrow the gap between the training and testing data distributions. In this paper, we propose a variant of the Batch Normalization layer called "Meta Batch Normalization", which learns an auxiliary network to dynamically adjust the affine parameters (scale and bias) of the Batch Normalization layers at testing time. With proper adjustment of the affine parameters, our experimental results show that the feature distributions are calibrated layer by layer, and the calibrated models achieve improvements of 9.7% and 12.7% on CIFAR-10-C and Tiny-ImageNet-C, respectively, compared to the vanilla model. An investigation using CAM visualization examples shows that a model trained with our method can focus on the salient features of an image under a noisy environment.
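The abstract does not spell out how the auxiliary network is wired to the Batch Normalization layers, so the snippet below is only a minimal PyTorch-style sketch of the general idea, not the author's implementation. The class name MetaBatchNorm2d, the choice of per-channel batch mean/std as the auxiliary network's input, and the aux_hidden width are illustrative assumptions.

```python
# Minimal sketch (not the thesis implementation): a BatchNorm2d variant whose
# affine parameters are adjusted at test time by a small auxiliary network.
import torch
import torch.nn as nn


class MetaBatchNorm2d(nn.Module):
    """BatchNorm2d whose scale/bias are corrected by an auxiliary MLP.

    The auxiliary network reads per-channel statistics of the incoming batch
    and predicts per-channel deltas added to the affine parameters gamma/beta.
    """

    def __init__(self, num_channels: int, aux_hidden: int = 16):
        super().__init__()
        # Standard BN without its own affine transform; we apply ours below.
        self.bn = nn.BatchNorm2d(num_channels, affine=False)
        self.gamma = nn.Parameter(torch.ones(num_channels))
        self.beta = nn.Parameter(torch.zeros(num_channels))
        # Auxiliary network: per-channel (mean, std) -> (delta_gamma, delta_beta).
        self.aux = nn.Sequential(
            nn.Linear(2 * num_channels, aux_hidden),
            nn.ReLU(),
            nn.Linear(aux_hidden, 2 * num_channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Statistics of the current (possibly distribution-shifted) batch.
        # Detached so gradients reach the aux network's weights but not the
        # backbone features through these statistics.
        mean = x.mean(dim=(0, 2, 3))
        std = x.std(dim=(0, 2, 3))
        deltas = self.aux(torch.cat([mean, std]).detach())
        d_gamma, d_beta = deltas.chunk(2)

        x = self.bn(x)  # normalize (batch stats in train, running stats in eval)
        gamma = (self.gamma + d_gamma).view(1, -1, 1, 1)
        beta = (self.beta + d_beta).view(1, -1, 1, 1)
        return x * gamma + beta
```

Because the predicted deltas depend on the statistics of the incoming batch, a layer of this form can re-scale and re-center shifted features at test time without retraining the backbone, which is the behavior the abstract describes.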