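An Empirical Investigation of Using Deep Learning-Based Approaches for Program Debugging and Repair | National Tsing Hua University Theses and Dissertations Repository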

Author:	Lin, Tzu-Yang (林子揚)
Thesis title:	An Empirical Investigation of Using Deep Learning-Based Approaches for Program Debugging and Repair (基於深度學習之程式除錯與修復實證研究)
Advisor:	Huang, Chin-Yu (黃慶育)
Committee members:	蘇銓清, 林振緯, 林其誼
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication:	2019
Academic year of graduation:	107 (ROC calendar)
Language:	English
Number of pages:	99
Keywords:	automatic program repair, search-based repair, genetic programming, deep learning, deep belief network

The System Development Life Cycle (SDLC) for software can be divided into planning, design, development, and maintenance phases. Across the whole cycle, maintenance demands the most effort and resources, which is why software quality assurance is essential. Within software quality assurance, automatic program repair is an important research topic. Without it, maintenance personnel spend a great deal of time and resources trying to find where an error occurred. Worse, manual repair may remove required functionality or introduce new latent errors, so the whole repair process becomes resource-intensive and unstable. Automatic program repair makes maintenance easier. When an error is found, the repair system uses fault prediction techniques to locate it, then applies search-based repair or semantics-based repair to try to fix it, and finally executes the test cases to confirm that the error has been removed and that program functionality is preserved. With automatic program repair, debugging becomes more convenient and more stable. In a modern society that relies heavily on information systems, the importance of automatic program repair is self-evident.
The main difficulty in automatic program repair is its long execution time, most of which is spent on testing. For example, the paper on GenProg, a state-of-the-art automatic program repair technique, reports that 62.75% of the execution time was spent on testing when repairing 16 target programs. As programs grow and the number of test cases increases, testing time has become one of the main bottlenecks of automatic program repair, and it is the problem this thesis attempts to solve.
This thesis proposes Deep Learning Based GenProg (DLBGP), a technique that combines deep learning with automatic program repair. The method extracts node vectors from the program's abstract syntax tree (AST), uses a Deep Belief Network (DBN) to extract semantic features, and then uses those features to predict whether each repair candidate is correct. Because deep learning models can classify large amounts of data quickly, this prediction lets the repair system judge the quality of repair candidates quickly and thereby save test-execution time. We did not use complex deep learning models, yet we still obtained clear experimental results: on 10 subject files from the IntroClass benchmark, DLBGP reduced execution time by 64% compared with GenProg while its repairs passed the same number of test cases. In summary, this thesis proposes a fast, effective, and easy-to-install deep learning based program repair method that can benefit the field of automatic program repair.
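
The idea of predicting candidate quality before running test cases can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the names `Candidate`, `filter_then_test`, `toy_predictor`, and `toy_test_suite`, the 0.5 threshold, and the feature values are all hypothetical, and a simple averaging function stands in for a trained DBN-based classifier. The sketch only shows how a cheap prediction step can reduce the number of test-suite runs.

```python
class Candidate:
    """A repair candidate with a feature vector (stand-in for learned semantic features)."""

    def __init__(self, name, features):
        self.name = name
        self.features = features


def filter_then_test(candidates, predict_correct, run_tests, threshold=0.5):
    """Run the test suite only on candidates the predictor scores at or above threshold.

    Returns (accepted candidates, number of test-suite runs).
    """
    accepted = []
    runs = 0
    for cand in candidates:
        if predict_correct(cand.features) < threshold:
            continue  # predicted incorrect: skip the expensive test run
        runs += 1
        if run_tests(cand):
            accepted.append(cand)
    return accepted, runs


def toy_predictor(features):
    # Hypothetical scorer: the mean feature value, standing in for a trained model.
    return sum(features) / len(features)


def toy_test_suite(candidate):
    # Hypothetical oracle: pretend only "patch_b" passes every test case.
    return candidate.name == "patch_b"


candidates = [
    Candidate("patch_a", [0.1, 0.2]),  # low score: filtered out, never tested
    Candidate("patch_b", [0.9, 0.8]),  # high score: tested, passes
    Candidate("patch_c", [0.7, 0.6]),  # high score: tested, fails
]
accepted, runs = filter_then_test(candidates, toy_predictor, toy_test_suite)
print([c.name for c in accepted], runs)
```

In this toy run the test suite executes for two of the three candidates. Any real saving depends on how accurate the predictor is, since a candidate that is wrongly filtered out is never tested.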

Abstract in Chinese
Abstract
Contents
List of Tables
List of Figures
Notation
Chapter 1    Introduction
Chapter 2    Related Work
2.1    Semantic-based repair
2.2    Search-based repair
Chapter 3    Deep Learning Based GenProg
3.1    Overview of deep learning based GenProg
3.2    Sanity Check
3.3    Program Representation
3.4    Selecting Candidates
3.5    Crossover
3.6    Mutation
3.7    Fitness Evaluation
Chapter 4    Experiments
4.1    Experimental Setup
4.2    Repair Results
4.3    Sensitivity Analysis
4.4    Research Questions
4.5    Threats to Validity
4.6    Our method’s feedback
Chapter 5    Conclusions and Future Work
Reference
Appendix
Appendix A. Our implementation of GenProg
Appendix B. Our implementation of AE
Appendix C. Subject files used in experiment from IntroClass
Appendix D. Subject files used in experiment from ITSP

