
Author: Huang, Shih-Min (黃詩閔)
Thesis title: Using Model Combinatorial Methods to Analyze Program Size Variation in Open-Source Software (運用模型組合方法於開放源碼軟體規模變動之分析)
Advisor: Huang, Chin-Yu (黃慶育)
Oral defense committee: 林振緯, 蘇銓清
Degree: Master
Department: College of Electrical Engineering and Computer Science, Institute of Information Systems and Applications
Year of publication: 2013
Academic year of graduation: 101 (ROC calendar)
Language: English
Number of pages: 73
Keywords (Chinese): 軟體大小估計, 程式碼長度, 組合模型, 分量迴歸
Keywords (English): Size estimation, Code length, Combination Model, Quantile Regression
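  • In software development, measuring software size is a key factor in whether a project succeeds. Software size is generally measured by lines of code, function points, or software complexity. In practice, because lines of code are easy to count and widely used across software measurement, most software engineers still use lines of code as the measure of software size when a project is completed. Earlier studies found that the lognormal distribution and the double Pareto distribution can both describe the distribution of software sizes within a project. This thesis takes a different angle and examines the distribution of the rate of change in lines of code. We observe that the rate of change in software size resembles the way interest rates change in economics, so we use distributions that are used to model interest-rate changes, namely the Laplace distribution, the normal distribution, and the asymmetric Laplace distribution, to model the software size-change rate.
      Many studies have pointed out that no single model is best in all situations. We therefore use a model combination method, Modified BIWDA (MBIWDA), which is a modification of the Bayesian inference weight decision approach (BIWDA), to model the distribution of the software size-change rate. Data from Apache, Ubuntu, and Samba Server are used to verify the predictive capability of the MBIWDA-based combination model for the size-change rate distribution. Compared with other model combination methods, the results indicate that the MBIWDA-based combination model provides good predictive capability in most cases. We also use quantile regression analysis to examine the factors that affect changes in software size, and find that in the early stage of a project, changes in software size are affected by faults of higher severity. These findings offer considerable insight for analyzing the size-change rate of open-source software.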
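As a rough illustration of the idea of combining candidate distributions, the sketch below fits a normal and a Laplace distribution to a series of size-change rates and mixes their densities with weights. It is not the MBIWDA procedure, whose details are not given in this record. The LOC snapshots are hypothetical, the relative-change definition of the rate is an assumption, and the likelihood-proportional weights (Bayes' rule with equal priors) are a generic stand-in. The asymmetric Laplace distribution is left out to keep the sketch short.

```python
import math
from statistics import mean, median

def change_rates(loc):
    # Assumption: rate of change = relative change between consecutive snapshots.
    return [(b - a) / a for a, b in zip(loc, loc[1:])]

def fit_normal(xs):
    # Maximum-likelihood estimates: sample mean and (biased) standard deviation.
    mu = mean(xs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / len(xs))
    return mu, sigma

def fit_laplace(xs):
    # Maximum-likelihood estimates: median and mean absolute deviation from it.
    loc = median(xs)
    scale = sum(abs(x - loc) for x in xs) / len(xs)
    return loc, scale

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def laplace_pdf(x, loc, scale):
    return math.exp(-abs(x - loc) / scale) / (2 * scale)

def log_likelihood(pdf, params, xs):
    return sum(math.log(pdf(x, *params)) for x in xs)

def posterior_weights(log_liks):
    # Equal priors: weight is proportional to each model's likelihood.
    # Subtracting the maximum keeps the exponentials numerically stable.
    m = max(log_liks)
    raw = [math.exp(ll - m) for ll in log_liks]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical LOC snapshots of one project (illustrative data only).
loc_snapshots = [1000, 1020, 1015, 1100, 1105, 1180, 1175, 1300, 1290, 1310, 1400]
rates = change_rates(loc_snapshots)

normal_params = fit_normal(rates)
laplace_params = fit_laplace(rates)
log_liks = [
    log_likelihood(normal_pdf, normal_params, rates),
    log_likelihood(laplace_pdf, laplace_params, rates),
]
weights = posterior_weights(log_liks)
equal_weights = [0.5, 0.5]  # equal-weight combination, for comparison

def combined_pdf(x, w):
    # Linear combination of the two fitted densities.
    return w[0] * normal_pdf(x, *normal_params) + w[1] * laplace_pdf(x, *laplace_params)

print("weights (normal, Laplace):", weights)
print("combined density at 0:", combined_pdf(0.0, weights))
```

Because the weights sum to one, the combined function is itself a valid density, and swapping `weights` for `equal_weights` gives the equal-weight combination listed in the table of contents.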


    Software size is one of the most important internal attributes of a software system, and estimating it is crucial to project success. Typically, software size can be described in terms of length, functionality, or complexity, but in practice many people still use lines of code (LOC) as the measure of software size, since LOC is widely used and can be easily measured upon project completion.
    In this paper, we used a linear combination model with a modified Bayesian inference weight decision approach (MBIWDA) to analyze the size distribution and the software size-change rate of Open-Source Software (OSS). Furthermore, we investigated the factors that influence the software size-change rate using the quantile regression (QR) model. Experiments were conducted using real data from several OSS projects, and the evaluation results showed that the linear combination model with the MBIWDA had an outstanding capability of fitting the distribution of the software size-change rate. Finally, the QR analysis demonstrated that faults with higher severity had an impact on LOC changes in the early stage. These findings offer an alternative view and reveal different issues of software sizing.
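To make the quantile regression step concrete, here is a minimal sketch that fits a straight line at a chosen quantile by minimizing the pinball (check) loss. It relies on the known property that, for a two-parameter linear model, an optimal fit passes through at least two data points, so it simply tries every pair. The data, the variable names, and the choice of a single predictor are hypothetical and are not taken from the thesis.

```python
from itertools import combinations

def pinball_loss(residual, tau):
    # Check loss: positive residuals cost tau, negative ones cost (1 - tau).
    return residual * tau if residual >= 0 else residual * (tau - 1)

def total_loss(xs, ys, intercept, slope, tau):
    return sum(pinball_loss(y - (intercept + slope * x), tau) for x, y in zip(xs, ys))

def fit_quantile_line(xs, ys, tau):
    # Exhaustive search over lines through two data points.
    # Fine for small samples; a linear-programming solver scales better.
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        if x1 == x2:
            continue  # a vertical line cannot be written as y = a + b * x
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        loss = total_loss(xs, ys, intercept, slope, tau)
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]

# Hypothetical data: high-severity faults per period vs. size-change rate.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0.01, 0.02, 0.02, 0.05, 0.04, 0.09, 0.07, 0.12]

for tau in (0.25, 0.5, 0.75):
    intercept, slope = fit_quantile_line(xs, ys, tau)
    print(f"tau={tau}: rate = {intercept:.4f} + {slope:.4f} * faults")
```

Comparing the slopes across quantiles shows whether the predictor matters more for large changes than for typical ones, which is the kind of question quantile regression answers and ordinary least squares does not.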

    Contents
    Abstract (in Chinese)
    Abstract
    Acknowledgement
    Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
    Chapter 2 Related Works
        2.1 Software Size Estimation
        2.2 Model Combination
    Chapter 3 Methods of Combinatorial model
        3.1 Equal Weight Combination
        3.2 Modified BIWDA Linear Combination Model
    Chapter 4 Experiments and Data Analysis
        4.1 Data Collection Procedure
        4.2 Parameter Estimation
        4.3 Evaluation Criteria
        4.4 Performance Comparison and Assessment
            4.4.1 Apache HTTP Server (DS1)
            4.4.2 Ubuntu (DS2)
            4.4.3 Samba Server (DS3)
    Chapter 5 Application
        5.1 Software Project Management
        5.2 Sensitivity Analysis
    Chapter 6 Conclusions and Future Work
    References


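    Full-text release date: this full text is not authorized for public release (on-campus network)
    Full-text release date: this full text is not authorized for public release (off-campus network)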
