Graduate student: | 楊景晴 Jing-Ching Yang |
---|---|
Thesis title: | 整合決策樹與關聯規則之資料挖礦架構及其實證研究 A Data Mining Framework with Decision Tree and Association Rules and Two Empirical Studies |
Advisor: | 簡禎富 教授 Chen-Fu Chien |
Oral defense committee: | |
Degree: | 碩士 Master |
Department: | 工學院 - 工業工程與工程管理學系 Department of Industrial Engineering and Engineering Management |
Year of publication: | 2004 |
Academic year of graduation: | 92 |
Language: | Chinese |
Pages: | 89 |
Chinese keywords: | 資料挖礦、決策樹、關聯規則、事故排除、決策分析 |
English keywords: | Data Mining, Decision Tree, Association Rule, Troubleshooting, Decision Analysis |
With advances in information technology and the spread of computers, companies have been building their own databases, and large volumes of data are now stored and recorded. More data should allow more accurate answers, but its sheer volume and complexity also make it harder for users to extract information, which lowers the value of the data. Data mining explores and analyzes large amounts of data by automatic or semi-automatic means to discover potentially useful information. In the modern business environment of e-Manufacturing and e-Commerce, decision makers can apply data mining techniques to extract valuable information or previously hidden patterns, and so handle decision problems that involve large amounts of mixed data. This research integrates decision trees and association rules into a data mining framework suited to general troubleshooting problems. Two empirical studies, one on the distribution feeder fault records of Taiwan Power Company and one on actual engineering data from a semiconductor foundry, are conducted to validate the framework. The results show the practical viability of the data mining approach for troubleshooting.
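The abstract names association rules without giving any formulas. As a rough illustration only, the sketch below computes the standard support and confidence measures for a rule of the form {rain, overhead_line} -> {outage}. The records, item names, and helper functions are hypothetical and are not taken from the thesis data or its framework, which also involves decision trees.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical fault records: each set lists the conditions seen in one incident.
# These values are invented for illustration; they are NOT from the thesis data.
records: List[FrozenSet[str]] = [
    frozenset({"rain", "overhead_line", "outage"}),
    frozenset({"rain", "overhead_line", "outage"}),
    frozenset({"rain", "underground_cable"}),
    frozenset({"clear", "overhead_line"}),
]


def support(itemset: Iterable[str], records: List[FrozenSet[str]]) -> float:
    """Fraction of records that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(items <= record for record in records) / len(records)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               records: List[FrozenSet[str]]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    a = frozenset(antecedent)
    denom = support(a, records)
    if denom == 0:
        raise ValueError("antecedent never occurs in the records")
    return support(a | frozenset(consequent), records) / denom


rule_support = support({"rain", "overhead_line", "outage"}, records)
rule_confidence = confidence({"rain", "overhead_line"}, {"outage"}, records)
print(rule_support, rule_confidence)
```

In this toy data the rule holds in two of four records, and every record matching the antecedent also shows the consequent. A real analysis would filter candidate rules by minimum support and confidence thresholds chosen for the data at hand.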