Graduate student: 林宗儀 LIN, ZONG-YI
Thesis title: 應用虛擬量測系統於半導體製程錯誤診斷 (Applying Virtual Metrology System for Fault Detection in Semiconductor Manufacturing Process)
Advisor: 陳榮順 CHEN, RONG-SHUN
Oral examination committee: 白明憲 BAI, MING-SIAN; 黃榮堂 HUANG, JUNG-TANG
Degree: 碩士 Master
Department: 工學院 College of Engineering - 動力機械工程學系 Department of Power Mechanical Engineering
Year of publication: 2017
Academic year of graduation: 105
Language: 中文 (Chinese)
Number of pages: 74
Chinese keywords: 虛擬量測系統、錯誤診斷模型、K折交叉驗證法、Holdout驗證法、SMOTETomek法
English keywords: Virtual Metrology System, Fault Detection Model, K-Fold Cross Validation, Holdout method, SMOTETomek method
After a wafer foundry starts production, a Wafer Acceptance Test is performed to ensure that the electrical parameters of the devices on each wafer fall within the allowable range. To avoid the time cost of testing every wafer, sampling inspection is usually adopted; to ensure that the untested wafers also meet specifications, a virtual metrology system that can predict the process outcome from process parameter data is needed as an auxiliary judgment tool. This study uses a semiconductor process dataset from the UCI (University of California, Irvine) machine learning repository to build a virtual metrology system, and proposes four fault detection models, PCA-SVM, KPCA-SVM, PCA-SMOTETomek-SVM, and KPCA-SMOTETomek-SVM, to diagnose the process results. During model construction, the K-fold cross-validation and Holdout methods are compared for splitting the data into training and validation sets. Because the dataset is imbalanced, the SMOTE method, which oversamples the minority class, is combined with the Tomek method, which undersamples the majority class, to improve the diagnostic ability of the models. The results show that K-fold cross-validation gives all four fault detection models better evaluation metric values, and that their diagnostic performance is more stable across different data splits; among the four models under K-fold cross-validation, the KPCA-SMOTETomek-SVM model achieves the best evaluation metric values.
In order to ensure that a wafer's electrical parameters are within the allowable range after processing, the devices on the wafer need to be tested. In practice, only a sample of wafers is tested periodically: if the sampled result is within the allowable range, the other wafers in the same lot are assumed to pass. This assumption is risky, however, because the process parameters may drift over time, so even wafers in the same lot may yield different results. A Virtual Metrology System is therefore needed to infer the results of the wafers that are not tested.
In this paper, we use a semiconductor process dataset from the UCI machine learning repository to build the Virtual Metrology System and then construct four fault detection models, PCA-SVM, KPCA-SVM, PCA-SMOTETomek-SVM, and KPCA-SMOTETomek-SVM, to detect faulty process results. In the model-building process, we first examine the difference between K-fold cross validation and the Holdout method, and then apply the SMOTETomek method to handle the imbalanced data problem and improve model performance. In summary, the results show that K-fold cross validation not only gives all four models better F1 scores but also makes them more stable than the Holdout method, and that the KPCA-SMOTETomek-SVM model achieves the best F1 score among the four models under K-fold cross validation.
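As a rough illustration of the workflow described above, the sketch below chains kernel PCA, SMOTETomek resampling, and an SVM classifier into one pipeline and evaluates it with stratified K-fold cross-validation using the F1 score. This is a minimal sketch only, not the thesis's actual implementation: the synthetic imbalanced dataset is a stand-in for the UCI semiconductor data, the hyper-parameters (30 kernel-PCA components, RBF kernels, k = 10) are illustrative assumptions, and the scikit-learn / imbalanced-learn APIs are used for convenience.

```python
# Sketch of a KPCA-SMOTETomek-SVM fault-detection pipeline evaluated with
# stratified K-fold cross-validation and the F1 score. Synthetic data stands
# in for the UCI semiconductor dataset; all hyper-parameters are illustrative.
from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import KernelPCA
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold, cross_val_score
from imblearn.combine import SMOTETomek      # pip install imbalanced-learn
from imblearn.pipeline import Pipeline       # sampler-aware pipeline

# Stand-in data: many sensor-like features, few "fail" wafers (class 1).
X, y = make_classification(n_samples=1500, n_features=100, n_informative=20,
                           weights=[0.93, 0.07], random_state=0)

# Resampling sits inside the pipeline, so SMOTETomek is applied only to the
# training folds; each validation fold keeps its original class imbalance.
model = Pipeline([
    ("scale", StandardScaler()),
    ("kpca", KernelPCA(n_components=30, kernel="rbf")),
    ("resample", SMOTETomek(random_state=0)),
    ("svm", SVC(kernel="rbf", C=1.0, gamma="scale")),
])

# Stratified K-fold cross-validation (k = 10), scored with F1 on the fault class.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="f1")
print("F1 per fold:", scores.round(3))
print("Mean F1: %.3f (+/- %.3f)" % (scores.mean(), scores.std()))
```

Replacing the StratifiedKFold object with a single train/validation split (e.g. scikit-learn's train_test_split) would give a Holdout-style baseline, and dropping the resampling step or swapping KernelPCA for plain PCA yields the other model variants compared in the thesis.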