Graduate student: | 陳全 Chen, Chuan |
---|---|
Thesis title: | 驗證深度學習軟體準確率於不同行動裝置穩定性之實證研究 Verify the Accuracy of Deep Learning Software on the Stability of Different Mobile Devices: An Empirical Study |
Advisor: | 邱銘傳 Chiu, Ming-Chuan |
Oral defense committee: | 李昀儒 Lee, Yun-Ju; 陳勝一 Chen, Sheng-I |
Degree: | Master |
Department: | College of Engineering - In-service Master's Program, Department of Industrial Engineering and Engineering Management |
Year of publication: | 2022 |
Academic year of graduation: | 110 (ROC calendar) |
Language: | Chinese |
Number of pages: | 41 |
Keywords (Chinese): | software quality, automated testing, deep learning, breath sounds |
Keywords (English): | sound |
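Software testing evaluates the functionality and stability of a software application, ensuring that each function meets its specified requirements without defects and thereby yielding a high-quality product. Extending automated testing to large-volume, repetitive product testing is highly efficient and also benefits performance, load, and stress testing. It reduces human error and oversight and, in the long run, can substantially lower labor costs.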
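This study establishes an automated testing process for deep learning software to verify that its accuracy remains stable across different mobile devices. Multiple audio files of inhalation and exhalation breath sounds were recorded at random and given to several professional clinical staff, who annotated the inhalation sounds. The annotations were compared, using the Jaccard similarity coefficient, with inference data produced by TensorFlow© on a Linux© system and by TensorFlow Lite© on several Android© mobile devices. A one-way ANOVA run in Minitab© 17 at the 95% confidence level gave a P-value of 0.033, a statistically significant difference. Even so, the confusion matrix shows an average accuracy of 99.1% for the TensorFlow© inference data and 96.4% for TensorFlow Lite© on the mobile devices, both above 95%. Inference data from three repeated runs on the same device were identical, and inference data across the different device types were also identical, indicating very high stability.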
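As an illustration of the one-way ANOVA step, the following minimal sketch computes the F statistic by hand for per-platform similarity scores. The platform names and score values are hypothetical and are not data from the thesis, which ran the test in Minitab. The P-value would then be read from the F distribution with the returned degrees of freedom, for example with a statistics package such as SciPy (`scipy.stats.f_oneway`).

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over the groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical Jaccard scores per platform (illustrative values only).
scores = {
    "tensorflow_linux": [0.95, 0.97, 0.96, 0.98],
    "tflite_device_a": [0.91, 0.93, 0.92, 0.94],
    "tflite_device_b": [0.91, 0.93, 0.92, 0.94],
}

f_stat, df_b, df_w = one_way_anova_f(list(scores.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value relative to the F distribution with these degrees of freedom corresponds to a small P-value, which is how a result such as P = 0.033 signals a difference between platforms.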
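Subsequent changes to the software or the deep learning model must achieve accuracy close to that of the original model and must maintain consistent stability across mobile devices. This serves as the verification standard and process, and the validation data set should be expanded to meet software quality requirements. Quality that satisfies customer needs has become the most important measure of product value today; creating customer value through continuous quality improvement can take an enterprise to a higher level.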
The goal of software testing is to evaluate the functionality and stability of software applications so that each function meets its specified requirements without defects, thereby producing high-quality products. Applying automated testing to large volumes of repetitive product tests is highly efficient. It can reduce human error and negligence, and can even greatly reduce labor costs in the long term.
This study establishes an automated testing process for deep learning software to verify that its accuracy remains stable across different mobile devices. Multiple breathing sound audio files were recorded at random and given to several professional clinical medical personnel to annotate. The annotations were then compared, using Jaccard similarity coefficient analysis, with the inference data produced by TensorFlow© on a Linux© system and by TensorFlow Lite© on various Android© mobile devices. At the 95% confidence level, the one-way ANOVA P-value is 0.033. Although this indicates a significant difference, the confusion matrix gives an average accuracy of 99.1% for TensorFlow© and 96.4% for TensorFlow Lite© on mobile devices, both above 95%. The inference data generated by the same mobile device over three repeated runs are all consistent.
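The two measures named above can be sketched as follows. This is a minimal illustration that assumes the annotations and the inference output are both expressed as equal-length sequences of per-frame binary labels (1 for an inhalation frame, 0 otherwise); the thesis record does not state its exact data representation, and the label values below are made up.

```python
def jaccard(reference, predicted):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| over frames labeled 1."""
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have the same length")
    intersection = sum(1 for r, p in zip(reference, predicted) if r and p)
    union = sum(1 for r, p in zip(reference, predicted) if r or p)
    # Two sequences with no positive frames at all are treated as identical.
    return 1.0 if union == 0 else intersection / union


def confusion_counts(reference, predicted):
    """Return (tp, fp, fn, tn) for binary labels."""
    tp = fp = fn = tn = 0
    for r, p in zip(reference, predicted):
        if r and p:
            tp += 1
        elif p:
            fp += 1
        elif r:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(reference, predicted):
    """Accuracy from the confusion matrix: (TP + TN) / all frames."""
    tp, fp, fn, tn = confusion_counts(reference, predicted)
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical clinician annotation and model inference for ten frames.
annotated = [0, 1, 1, 1, 0, 0, 1, 1, 0, 0]
inferred = [0, 1, 1, 0, 0, 0, 1, 1, 1, 0]

print(f"Jaccard:  {jaccard(annotated, inferred):.3f}")
print(f"Accuracy: {accuracy(annotated, inferred):.3f}")
```

Jaccard ignores frames that both sequences label as negative, while accuracy counts them as correct, so the two measures can differ noticeably on recordings where inhalation occupies a small share of the frames.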
Subsequent changes to the software or the deep learning model must achieve accuracy close to that of the original model and maintain consistent stability across different mobile devices. This serves as the verification standard and process, and the data set should be expanded to meet software quality requirements.
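The verification standard described here could be automated along the following lines. This is only a sketch: the function name `check_release`, the device names, the label data, and the `max_drop` tolerance of 0.03 are all hypothetical, since the record does not quantify how close to the original accuracy a changed model must be. The 0.991 and 0.964 figures are the averages reported in the abstract.

```python
def check_release(baseline_accuracy, device_accuracy, device_runs, max_drop=0.03):
    """Return a list of failure messages; an empty list means the check passed.

    device_accuracy maps a device name to its measured accuracy.
    device_runs maps a device name to a non-empty list of repeated inference
    outputs, each output being a list of labels.
    max_drop is an assumed tolerance, not a value taken from the thesis.
    """
    failures = []

    # 1. Accuracy must stay close to the original model's accuracy.
    for device, acc in device_accuracy.items():
        if baseline_accuracy - acc > max_drop:
            failures.append(f"{device}: accuracy {acc:.3f} is too far below baseline")

    # 2. Repeated runs on the same device must give identical output.
    for device, runs in device_runs.items():
        if any(run != runs[0] for run in runs[1:]):
            failures.append(f"{device}: repeated runs disagree")

    # 3. Different devices must give identical output to one another.
    first_outputs = [runs[0] for runs in device_runs.values()]
    if any(out != first_outputs[0] for out in first_outputs[1:]):
        failures.append("devices disagree with each other")

    return failures


# Hypothetical devices, each with three repeated runs over the same input.
runs = {
    "device_a": [[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0]],
    "device_b": [[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0]],
}
accuracies = {"device_a": 0.964, "device_b": 0.964}

problems = check_release(0.991, accuracies, runs)
print("PASS" if not problems else "\n".join(problems))
```

Returning the full list of failures, instead of stopping at the first one, lets a test report show every device that regressed in a single run.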