Graduate student: | 詹如雯 Zu-Wen Chan |
---|---|
Thesis title: | 以減裁法求學習函數 A Pruning Approach to Discovering the Learning Functions |
Advisor: | 王小璠 Hsiao-Fan Wang |
Oral defense committee: | |
Degree: | Master |
Department: | College of Engineering - Department of Industrial Engineering and Engineering Management |
Year of publication: | 2005 |
Academic year of graduation: | 93 (ROC calendar) |
Language: | English |
Number of pages: | 58 |
Keywords (Chinese): | 學習曲線, 資料探勘, 支撐向量機, 迴歸分析 |
Keywords (English): | Learning curves, Data Mining, Support Vector Machine, Regression Analysis |
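The learning curve was proposed by Wright in 1936. It describes how a worker who keeps repeating the same operation learns or memorizes it, so that the total time needed to complete one unit of product falls at a constant rate. In this thesis, we seek to extract, from a large data set that exhibits learning-curve characteristics, a subset that possesses those characteristics. To this end we propose a pruning approach based on the concept of the Support Vector Machine (SVM). The procedure has three main stages. For a given data set, we first identify the quadrant in which the learning curve lies. We then compute the central point and, in keeping with the characteristics of the learning curve, take the coordinate axes through the central point as our first separating hyperplane, which divides the data into two groups, and we prune the unneeded data. Next, based on the asymmetry of the learning curve, we apply regression analysis to the remaining data to fit a linear regression line, use that line as our second separating hyperplane, and again prune the unneeded data. For the data that remain, we use the Nelder-Mead simplex method to obtain a fitted learning curve under the minimum-error criterion. We then use residual analysis to evaluate the fitted function, prune unneeded data using three different ranges of standard deviation, and apply the Nelder-Mead simplex method again to find the best-fit learning curve. Finally, we weigh the trade-off between the amount of data pruned and accuracy. In this thesis we use three models to verify the feasibility of the proposed pruning approach.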
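The constant-rate reduction attributed to Wright is commonly written as a power law. The abstract does not state the functional form the thesis uses, so the form and the symbols $T_n$, $T_1$, $b$, and $r$ below are assumptions taken from the standard textbook presentation. Here $T_n$ is the time associated with the $n$-th unit (in Wright's original formulation, the cumulative average time per unit) and $r$ is the learning rate:

$$
T_n = T_1 \, n^{b}, \qquad b = \log_2 r, \qquad 0 < r < 1,
$$

so that every doubling of cumulative output scales the time by the same factor:

$$
\frac{T_{2n}}{T_n} = 2^{b} = r .
$$

For example, with an 80% learning rate, $b = \log_2 0.8 \approx -0.322$, which gives $T_2 = 0.8\,T_1$ and $T_4 = 0.64\,T_1$.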
In this thesis, we propose a pruning procedure, based on the concept of the Support Vector Machine (SVM), to discover a subset that possesses the characteristics of a learning function from a large data set. The procedure has three major pruning stages. In the first stage, for a given data set, we identify the quadrant in which the learning function lies, calculate the central point, and take the coordinate axes through the central point as our first separating hyperplane. Using this hyperplane, we classify the data into two clusters and prune away the unnecessary data. In the second stage, based on the asymmetry of the learning function, we apply regression analysis to the remaining data and take the resulting regression line as our second separating hyperplane, again pruning away unnecessary data. The Nelder-Mead simplex method is then used to obtain the minimum-error learning function of the remaining data. In the third stage, we use residual analysis to evaluate the resulting function and prune unnecessary data at three levels of standard deviation. We then apply the Nelder-Mead simplex method again to obtain the best-fit learning function of the remaining data and examine the trade-off between accuracy and the amount of data pruned. Three models were used to demonstrate the proposed pruning procedure, and the results are promising for application.
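The third stage (fit, residual analysis, prune, refit) can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and it rests on several assumptions that the abstract does not confirm: the learning function is taken to be the power law y = a * x**b, the three levels of standard deviation are taken to be 1, 2 and 3 sigma, the data are synthetic, and `nelder_mead` is a simplified variant of the simplex method (reflection, expansion, inside contraction, shrink). All function names are hypothetical.

```python
import random
import statistics


def nelder_mead(f, x0, rel_step=0.1, tol=1e-10, max_iter=2000):
    """Simplified Nelder-Mead minimizer; the best vertex never gets worse."""
    n = len(x0)
    simplex = [list(x0)]
    for i in range(n):
        p = list(x0)
        p[i] += rel_step * (abs(p[i]) or 1.0)  # scale the step to the parameter
        simplex.append(p)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        # Centroid of every vertex except the worst one.
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]

        def along(t):
            # Point centroid + t * (centroid - worst).
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(reflected) < f(best):
            expanded = along(2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(-0.5)  # inside contraction toward the centroid
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway toward the best one.
                simplex = [best] + [
                    [b + 0.5 * (v - b) for b, v in zip(best, p)]
                    for p in simplex[1:]
                ]
    return min(simplex, key=f)


def power_law_sse(params, points):
    """Sum of squared errors of y = a * x**b (assumed learning function)."""
    a, b = params
    return sum((y - a * x ** b) ** 2 for x, y in points)


def fit_power_law(points, start):
    return nelder_mead(lambda p: power_law_sse(p, points), start)


def prune_by_residuals(points, params, k):
    """Keep the points whose residual lies within k standard deviations."""
    a, b = params
    residuals = [y - a * x ** b for x, y in points]
    sigma = statistics.stdev(residuals)
    return [pt for pt, r in zip(points, residuals) if abs(r) <= k * sigma]


def demo():
    rng = random.Random(0)
    true_a, true_b = 100.0, -0.322  # 80% learning rate: log2(0.8) is about -0.322
    curve = [(x, true_a * x ** true_b + rng.gauss(0, 1)) for x in range(1, 41)]
    outliers = [(x, true_a * x ** true_b + 40) for x in (5, 15, 25, 35)]
    data = curve + outliers
    first = fit_power_law(data, [data[0][1], -0.5])
    results = {}
    for k in (1, 2, 3):  # assumed "three levels of standard deviation"
        kept = prune_by_residuals(data, first, k)
        refit = fit_power_law(kept, first)
        rmse = (power_law_sse(refit, kept) / len(kept)) ** 0.5
        results[k] = (len(data) - len(kept), refit, rmse)
    return results


if __name__ == "__main__":
    for k, (removed, (a, b), rmse) in demo().items():
        print(f"k={k}: removed={removed}, a={a:.2f}, b={b:.3f}, rmse={rmse:.3f}")
```

Comparing the number of points removed with the refit error at each level gives a simple view of the trade-off between accuracy and the amount of data pruned. The first two stages are omitted here because the abstract does not say which side of each separating hyperplane is discarded.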
[1] Bigus J. P., Data Mining with Neural Networks: Solving Business Problems from Application Development to Decision Support, New York, The McGraw-Hill Companies, Inc., 1996.
[2] Carlson J. G., How Management Can Use the Improvement Phenomenon, California Management Review, Vol. 3, No. 2, Winter 1961, pp. 83-94.
[3] Carrillo M., J. M. Gonzalez, A New Approach to Modeling Sigmoidal Curves, Technological Forecasting and Social Change, Vol. 69, 2002, pp. 233-241.
[4] Chand S., Lot Sizes and Setup Frequency with Learning in Setups and Process Quality, European Journal of Operational Research, Vol. 42, 1989, pp. 190-202.
[5] Chand S., and S. P. Sethi, A Dynamic Lot Sizing Model with Learning in Setups, Operations Research, Vol. 38, 1990, pp. 644-655.
[6] Chen W. H., S. H. Hsu, H. P. Shen, Application of SVM and ANN for Intrusion Detection, Computers and Operations Research, Vol. 32, No. 10, October 2005, pp. 2617-2634.
[7] Fausett L., Fundamentals of Neural Networks, Prentice-Hall International, 1994.
[8] Frawley W. J., G. Piatetsky-Shapiro, and C. J. Matheus, Knowledge Discovery in Databases: An Overview, AI Magazine, Fall 1992, pp. 57-69.
[9] Freeman J. A. and D. M. Skapura, Neural Networks: Algorithms, Applications, and Programming Techniques, Mass., Addison-Wesley Publishing Company, 1992.
[10] Hoehler F. K., Logistic Equations in the Analysis of S-shaped Curves, Computers in Biology and Medicine, Vol. 25, No. 3, May 1995, pp. 367-371.
[11] Haykin S., Neural Networks: A Comprehensive Foundation, 2nd Ed., New Jersey, Prentice Hall, 1999.
[12] Jaber Y. M., S. Sikstrom, A Numerical Comparison of Three Potential Learning and Forgetting Models, International Journal of Production Economics, Vol. 92, 2004, pp. 281-294.
[13] Kleinbaum D., L. Kupper, K. Muller, Applied Regression Analysis and Other Multivariate Methods, 2nd Ed., Boston, MA: PWS-Kent, 1988.
[14] Kohonen T., The Self-Organizing Map, Proceedings of the IEEE, Vol. 78, No. 9, Sept. 1990, pp. 1464-1480.
[15] Lagarias J. C., J. A. Reeds, M. H. Wright, and P. E. Wright, Convergence Properties of the Nelder-Mead Simplex Method in Low Dimensions, SIAM Journal on Optimization, Vol. 9, No. 1, 1998, pp. 112-147.
[16] Nelder J. A., R. Mead, A Simplex Method for Function Minimization, Computer Journal, Vol. 7, 1965, pp. 308-313.
[17] Neter J., M. H. Kutner, C. J. Nachtsheim, W. Wasserman, Applied Linear Regression Models, 3rd Ed., New York, The McGraw-Hill Companies, Inc., 1999.
[18] Ratkowsky D. A., Nonlinear Regression Modeling: A Unified Practical Approach, New York, Marcel Dekker, 1983.
[19] Spence A. M., The Learning Curve and Competition, Bell Journal of Economics, Vol. 12, No. 1, 1981, pp. 49-70.
[20] Vapnik V. N., The Nature of Statistical Learning Theory, New York: Springer-Verlag, 1995.
[21] Wang S., Nonlinear Regression: A Hybrid Model, Computers and Operations Research, Vol. 26, No. 8, July 1999, pp. 799-817.
[22] Witten I. H., E. Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, San Francisco, Calif.: Morgan Kaufmann Publishers, 2000.
[23] Wright T. P., Factors Affecting the Cost of Airplanes, Journal of the Aeronautical Sciences, Vol. 3, February 1936, pp. 122-128.
[24] Yelle L. E., The Learning Curve: Historical Review and Comprehensive Survey, Decision Sciences, Vol. 10, 1979, pp. 302-328.
[25] http://math.fullerton.edu/mathews/n2003/NelderMeadProof.html
[26] http://mathworld.wolfram.com/AlgebraicFunction.html
[27] http://sysdyn.clexchange.org/sdep/Roadmaps/RM6/D-4476-2.pdf#search='Exploring%20Sshaped%20growth'
[28] http://www.lions.odu.edu/~runal/enma717/mertlc.ppt