| Graduate student: | 孫浩倫 Sun, Hao-Lun |
|---|---|
| Thesis title: | 神經保險絲:在低電壓環境下提升具存取限制的神經網路在測試時的準確率 NeuralFuse: Improving the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes |
| Advisor: | 何宗易 Ho, Tsung-Yi |
| Committee members: | 李淑敏 Li, Shu-Min; 游家牧 Yu, Chia-Mu |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
| Year of publication: | 2022 |
| Graduation academic year: | 111 |
| Language: | English |
| Number of pages: | 57 |
| Keywords (Chinese): | 深度神經網路、神經網路加速器、低電壓、節約能源 |
| Keywords (English): | Deep Neural Network, DNN Accelerator, Low Voltage, Energy Saving |
| Record statistics: | Views: 221; Downloads: 0 |
Deep neural networks (DNNs) are among the state-of-the-art models adopted in many machine-learning-based systems and algorithms. However, a notable issue with DNNs is their considerable energy consumption during training and inference. At the hardware level, one current approach to saving energy at the inference phase is to reduce the voltage supplied to the DNN hardware accelerator. However, operating in the low-voltage regime induces random bit errors in the model weights stored in memory and thereby degrades model performance. To address this challenge, we propose NeuralFuse, a novel input-transformation technique deployed as an add-on module to protect the model from severe accuracy drops in low-voltage regimes. With NeuralFuse, we can mitigate the tradeoff between energy and accuracy without retraining the model, and it can be readily applied to access-limited DNNs, such as DNNs on non-configurable hardware or DNNs reached only through remote cloud-based APIs. Compared with unprotected DNNs, our experimental results show that NeuralFuse reduces memory-access energy by up to 24% while simultaneously improving accuracy in low-voltage regimes by up to 57%. To the best of our knowledge, this is the first model-agnostic approach (i.e., requiring no model retraining) to mitigating the accuracy-energy tradeoff in low-voltage regimes.
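The two ingredients described above can be illustrated with a minimal sketch: a fault model that flips bits of quantized weights independently with some probability (a common abstraction of low-voltage SRAM errors), and a NeuralFuse-style wrapper that transforms the input before passing it to a frozen, access-limited base model. This is not the thesis's implementation; the function names (`flip_bits`, `neuralfuse_forward`), the 8-bit quantization, and the identity-style generator in the usage are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def flip_bits(weights_q, p, rng):
    """Simulate low-voltage memory faults: each bit of each 8-bit
    quantized weight flips independently with probability p."""
    w = weights_q.astype(np.uint8).reshape(-1)
    # Build a per-byte XOR mask: bit k is set with probability p.
    bits = rng.random((w.size, 8)) < p
    mask = (bits * (1 << np.arange(8))).sum(axis=1).astype(np.uint8)
    return (w ^ mask).reshape(weights_q.shape)

def neuralfuse_forward(x, generator, base_model):
    """NeuralFuse-style inference: a learned input transformation is
    prepended to the frozen base model, which is never retrained.
    The transformed input is clipped back to the valid data range."""
    x_t = np.clip(x + generator(x), 0.0, 1.0)
    return base_model(x_t)
```

With a bit-flip probability of p per bit, a fraction of roughly 1 - (1 - p)^8 of the 8-bit weights is corrupted; NeuralFuse counters the resulting accuracy drop purely from the input side, which is why it applies to non-configurable hardware and remote APIs alike.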
Full-text availability: not authorized for public release (campus network only).