
Graduate student: 欒紹蒲 (Luan, Shao-Pu)
Thesis title: Applying 2-Parameter Generalized Pareto Model with Single Change-Point to Analyze the Fault Distribution of Open Source Software (運用具單變換點雙參數廣義柏拉圖模型於開放源碼軟體之錯誤分佈分析)
Advisor: 黃慶育 (Huang, Chin-Yu)
Oral defense committee: 蘇銓清 (Sue, Chuan-Ching); 林振緯 (Lin, Jenn-Wei)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2012
Academic year of graduation: 100
Language: English
Number of pages: 70
Keywords: Open source software, fault distribution, Pareto principle, Weibull distribution model, change-point
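  • It is widely agreed that open source software (OSS) plays an increasingly important role in modern society. Although the reliability of OSS has been discussed extensively in the literature, studies of the fault distribution of OSS remain comparatively scarce. Earlier studies found that the Pareto principle, the traditionally used Pareto distribution, and the Weibull distribution model can all describe the distribution of software faults; our previous study, however, found that a 2-parameter generalized Pareto distribution (2-GPD) model related to the Pareto principle is better suited to modeling the software fault distribution.
    This thesis examines a modified model based on the 2-GPD, the 2-parameter generalized single change-point Pareto distribution (SCP-2GPD), in which the choice of change-point is closely related to the Pareto principle. The study focuses on modeling the fault distribution of OSS and presents some mathematical properties of the SCP-2GPD model. Fault data for Apache and Mozilla, collected from the open source bug database Bugzilla, are used to verify the model's ability to predict the fault distribution. Compared with other related fault distribution models, the results indicate that the proposed SCP-2GPD model predicts the fault distribution of OSS quite accurately. These findings are instructive for analyzing the fault distribution of a wide range of real-world OSS.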


    There is general agreement that open source software (OSS) plays an increasingly critical role in modern society. Past research has shown that the Pareto principle, the traditionally used Pareto distribution (PD), and the Weibull distribution (WD) models can describe the distribution of software faults; in our previous study, however, a 2-parameter generalized Pareto distribution (2-GPD) related to the Pareto principle appeared more useful for modeling that distribution. This thesis studies a modification of the 2-GPD model called the 2-parameter generalized single change-point Pareto distribution (SCP-2GPD) model, in which the selection of the change-point is closely tied to the Pareto principle. The research focuses on modeling the distribution of OSS faults and presents some mathematical properties of the SCP-2GPD model. Experiments on fault data for Apache and Mozilla, drawn from the OSS bug database Bugzilla, are performed to ascertain the model's capability to predict fault distributions. Compared with other fault distribution models, the findings suggest that the proposed SCP-2GPD model predicts the fault distribution of OSS fairly accurately. These findings have implications for analyzing the fault distributions of a variety of real-life OSS.
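    The abstract names the Pareto principle, the 2-GPD, and a single change-point variant without stating their formulas, so the following Python sketch is only an illustration under stated assumptions. It assumes the common two-parameter generalized Pareto form F(x) = 1 - (1 + shape*x/scale)^(-1/shape), and it builds the change-point version by a generic construction (the survival function continues from the change-point with a second parameter set). That construction, and the names `pareto_share`, `gpd_cdf`, `scp_cdf`, and `tau`, are hypothetical and are not taken from the thesis.

```python
import math


def pareto_share(fault_counts, top_fraction=0.2):
    """Share of all faults found in the most fault-prone components."""
    ranked = sorted(fault_counts, reverse=True)
    k = max(1, round(top_fraction * len(ranked)))
    return sum(ranked[:k]) / sum(ranked)


def gpd_cdf(x, scale, shape):
    """Two-parameter generalized Pareto CDF (assumed parameterization)."""
    if x <= 0:
        return 0.0
    if abs(shape) < 1e-12:
        # Limiting case shape -> 0 is the exponential distribution.
        return 1.0 - math.exp(-x / scale)
    base = 1.0 + shape * x / scale
    if base <= 0:
        # For shape < 0 the support has a finite upper endpoint.
        return 1.0
    return 1.0 - base ** (-1.0 / shape)


def scp_cdf(x, tau, scale1, shape1, scale2, shape2):
    """Illustrative single change-point CDF (assumption, not the thesis model).

    Before tau the first parameter set applies; after tau the remaining
    survival probability decays according to the second parameter set.
    """
    if x <= tau:
        return gpd_cdf(x, scale1, shape1)
    survival_at_tau = 1.0 - gpd_cdf(tau, scale1, shape1)
    survival_after = 1.0 - gpd_cdf(x - tau, scale2, shape2)
    return 1.0 - survival_at_tau * survival_after


# Made-up fault counts for ten components; the top two hold 80 of 100 faults.
counts = [50, 30, 5, 5, 3, 3, 2, 1, 1, 0]
print(pareto_share(counts))  # 0.8
```

    A share near 0.8 for the top 20% of components is the pattern the Pareto principle describes. In this sketch `tau` is passed in by hand; the thesis ties the selection of the change-point to the Pareto principle, but the record does not say how.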

    Abstract in Chinese
    Abstract
    Acknowledgement
    List of Tables
    List of Figures
    Chapter 1 Introduction
    Chapter 2 Motivations
      2.1 Related Works
      2.2 Concept of Component-Based Change-Point
    Chapter 3 2-Parameter GPD Model with a Single Change Point
      3.1 Model Description
      3.2 Mathematical Properties
        A. The Characterization of Failure Rate Function
        B. The Moment Estimator of SCP-2GPD
        C. Relationship to Other Distributions
    Chapter 4 Experiments and Data Analysis
      4.1 Data Collection Procedure
      4.2 Parameters Estimation
        4.2.1 Least Square Estimation for SCP-2GPD
        4.2.2 Piecewise Maximum Likelihood Estimation for SCP-2GPD
      4.3 Evaluation Criteria
      4.4 Performance Comparison and Assessment
        4.4.1 Resource
    Chapter 5 Discussion and Conclusions
    Appendix: Sources of OSS failure data
      A. Apache Ant 1.6.x
      B. Apache HTTP Server 2.x
      C. Mozilla Firefox 2.x and 3.x
      D. Mozilla Thunderbird 2.x
    References


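    Full-text release date: the full text is not authorized for public release (on-campus network)
    Full-text release date: the full text is not authorized for public release (off-campus network)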
