
Author: Lin, Hung-Chun (林鴻鈞)
Title: Applying Electromagnetism-like Mechanism Algorithm to Feature Selection: Theory and Application
Advisor: Su, Chao-Ton (蘇朝墩)
Degree: Doctoral
Department: College of Engineering - Department of Industrial Engineering and Engineering Management
Year of Publication: 2011
Academic Year of Graduation: 99
Language: English
Pages: 68
Keywords: Feature selection, NP-complete problem, Electromagnetism-like Mechanism algorithm, Missing at random, Robust Bayes Classifier


    High-dimensional problems occur increasingly often in data mining, raising both computation time and cost. To address such problems, various feature selection methods have been developed in recent years. Based on how they evaluate candidate subsets, feature selection techniques fall into two groups: the filter model and the wrapper model. The wrapper model uses the error produced by a classifier as the selection criterion and focuses on minimizing the classifier's misclassification rate. Feature selection is regarded as an NP-complete problem; consequently, many meta-heuristics have been proposed to solve it.
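The wrapper model described above can be sketched in a few lines of Python. This is an illustrative sketch, not the thesis's implementation: a binary mask encodes a feature subset, and the leave-one-out error of a minimal 1-nearest-neighbor classifier on that subset serves as the selection criterion.

```python
def one_nn_predict(train_X, train_y, x, mask):
    # Classify x by its nearest training point, measuring squared
    # Euclidean distance only over features where mask[j] == 1.
    best_dist, best_label = float("inf"), None
    for xi, yi in zip(train_X, train_y):
        d = sum((a - b) ** 2 for a, b, m in zip(xi, x, mask) if m)
        if d < best_dist:
            best_dist, best_label = d, yi
    return best_label

def wrapper_error(X, y, mask):
    # Leave-one-out error of 1-NN on the feature subset encoded by
    # mask; the wrapper model uses this error as its fitness value.
    errors = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if one_nn_predict(train_X, train_y, X[i], mask) != y[i]:
            errors += 1
    return errors / len(X)
```

A search procedure (here, the EM algorithm) then looks for the mask minimizing `wrapper_error`; an informative feature yields a lower error than a noisy one.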
    The Electromagnetism-like Mechanism (EM) algorithm was proposed by Birbil and Fang in 2003. It uses the attraction-repulsion mechanism of electromagnetism theory to search for the optimal solution. So far, EM has been applied mostly to optimization in continuous space, with a few studies addressing discrete problems; it has not yet been applied to feature selection. This study combines EM with the 1-nearest-neighbor (1NN) classifier for feature selection and classification. A numerical experiment verifies the feature selection capability of the EM algorithm on complete data. The proposed method is then applied to a real case concerning gestational diabetes mellitus, and the outcomes demonstrate that it is workable in practice.
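The attraction-repulsion mechanism can be illustrated with a compact sketch of the continuous EM algorithm. This is an illustrative reading of Birbil and Fang's scheme, not the thesis code: each point receives a charge from the exponential formula of the original paper, better points attract, worse points repel, and every non-best point takes a random step along the normalized resultant force.

```python
import math
import random

def em_minimize(f, dim, lo, hi, pop=20, iters=50, seed=0):
    # Minimize f over the box [lo, hi]^dim with an EM-style population.
    rng = random.Random(seed)
    pts = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop)]
    for _ in range(iters):
        vals = [f(p) for p in pts]
        best = min(range(pop), key=vals.__getitem__)
        span = sum(vals) - pop * vals[best] or 1e-12
        # Charge: exp(-n * (f(x_i) - f(x_best)) / span); larger charge
        # for points closer to the current best.
        q = [math.exp(-dim * (vals[i] - vals[best]) / span) for i in range(pop)]
        new_pts = []
        for i in range(pop):
            if i == best:                 # the best point stays put
                new_pts.append(pts[i])
                continue
            force = [0.0] * dim
            for j in range(pop):
                if j == i:
                    continue
                diff = [pts[j][k] - pts[i][k] for k in range(dim)]
                d2 = sum(c * c for c in diff) or 1e-12
                s = q[i] * q[j] / d2
                if vals[j] < vals[i]:     # better point attracts
                    force = [fk + s * c for fk, c in zip(force, diff)]
                else:                     # worse point repels
                    force = [fk - s * c for fk, c in zip(force, diff)]
            norm = math.sqrt(sum(c * c for c in force)) or 1e-12
            step = rng.random()
            moved = [min(hi, max(lo, pts[i][k] + step * force[k] / norm))
                     for k in range(dim)]
            new_pts.append(moved)
        pts = new_pts
    vals = [f(p) for p in pts]
    best = min(range(pop), key=vals.__getitem__)
    return pts[best], vals[best]
```

Because the current best point is always carried over unchanged, the best objective value found is non-increasing over iterations. For feature selection the same scheme is applied to a discretized encoding of the feature mask, with `wrapper_error`-style fitness as the objective.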
    On the other hand, actual data sets are often incomplete for various reasons. Consequently, classification algorithms for incomplete data have received increasing attention. Many methods have been developed to handle incomplete data, but they either have drawbacks or presuppose that the data are missing at random (MAR), an assumption that is difficult to verify. To avoid this assumption, Ramoni and Sebastiani proposed the Robust Bayes Classifier (RBC). Nevertheless, RBC assumes that the attributes are independent within each class; if this assumption is violated, classification performance degrades substantially. This study therefore combines the EM algorithm with RBC to find the feature subset with the best performance. Another numerical experiment verifies the feature selection capability of EM on incomplete data.
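To make the incomplete-data setting concrete, the following sketch trains a naive-Bayes-style classifier whose counts simply skip missing entries (`None`). This "available-case" treatment is a deliberately simplified stand-in, not RBC itself: RBC instead derives probability intervals consistent with every possible completion of the missing values, which is what lets it drop the MAR assumption.

```python
import math
from collections import defaultdict

def train_counts(X, y):
    # Per-class, per-attribute value counts; missing entries (None)
    # are skipped rather than imputed.
    class_n = defaultdict(int)
    counts = defaultdict(int)   # (class, attr index, value) -> count
    seen = defaultdict(int)     # (class, attr index) -> observed count
    for xi, yi in zip(X, y):
        class_n[yi] += 1
        for j, v in enumerate(xi):
            if v is None:
                continue
            counts[(yi, j, v)] += 1
            seen[(yi, j)] += 1
    return class_n, counts, seen

def predict(x, class_n, counts, seen):
    # Naive Bayes log-score: class prior plus per-attribute
    # log-likelihoods with add-one smoothing; missing test values
    # are skipped as well.
    total = sum(class_n.values())
    best_c, best_s = None, float("-inf")
    for c, n in class_n.items():
        s = math.log(n / total)
        for j, v in enumerate(x):
            if v is None:
                continue
            s += math.log((counts[(c, j, v)] + 1) / (seen[(c, j)] + 2))
        if s > best_s:
            best_c, best_s = c, s
    return best_c
```

In the thesis's setting, the EM algorithm searches over feature subsets and an RBC-style classifier supplies the fitness; here the simplified classifier merely shows how incomplete records can still contribute their observed attributes.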
    The results of these two lines of experiments show that the EM algorithm is useful and effective for feature selection with both complete and incomplete data.

    Abstract (Chinese)
    ABSTRACT
    Acknowledgements
    CONTENTS
    TABLES
    FIGURES
    1 INTRODUCTION
    1.1 Overview and Motivations
    1.2 Objectives
    1.3 Framework and Organization
    2 RELATED WORKS
    2.1 Wrapper Model
    2.2 Electromagnetism-like Mechanism Algorithm
    2.3 Robust Bayes Classifiers
    3 PROPOSED APPROACH
    4 PERFORMANCE ANALYSES
    4.1 Performance Indices
    4.2 Performance Evaluation of Hybrid Method with Complete Data
    4.2.1 Effects of Parameters at Different Levels
    4.2.2 Numerical Experiments
    4.3 Performance Evaluation of Hybrid Method with Incomplete Data
    4.3.1 Effects of Parameters at Different Levels
    4.3.2 Numerical Experiments
    4.4 Discussions
    5 A CASE STUDY ABOUT DIABETES MELLITUS PREDICTION
    5.1 Case Description
    5.2 Data Collection and Analysis
    5.3 Concluding Remarks
    6 CONCLUSIONS
    REFERENCES

    [1] S. Puuronen, A. Tsymbal, and I. Skrypnyk, “Advanced Local Feature Selection in Medical Diagnostics,” Proceeding of 13th IEEE Symposium on Computer-Based Medical Systems, pp. 25-30, 2000.
    [2] M. Dash and H. Liu, “Feature Selection for Clustering,” Proceeding Pacific Asia Conference on Knowledge Discovery and Data Mining, Kyoto, Japan, pp. 110-121, 2000.
    [3] H.M. Lee, C.M. Chen, J.M. Chen, and Y.-L. Jou, “An Efficient Fuzzy Classifier with Feature Selection Based on Fuzzy Entropy,” IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics, vol. 31(3), pp. 426-432, 2001.
    [4] M.A. Chamjangali, M. Beglari, and G. Bagherian, “Prediction of Cytotoxicity Data (CC50) of Anti-HIV 5-pheny-1-phenylamino-1H-imidazole Derivatives by Artificial Neural Network Trained with Levenberg-marquardt Algorithm,” Journal of Molecular Graphics and Modelling, vol. 26(1), pp. 360-367, 2007.
    [5] A. Rahman, M. Murshed, and L.S. Dooley, “Feature Weighting Methods for Abstract Features Applicable to Motion Based Video Indexing,” Proceeding of the International Conference on Information Technology: Coding and Computing, Las Vegas, NV, pp. 676-680, 2004.
    [6] C.Z. Wang, C.X. Wu, and D.G. Chen, “A Systematic Study on Attribute Reduction with Rough Sets Based on General Binary Relations,” Information Sciences, vol. 178(9), pp. 2237-2261, 2008.
    [7] X.Y. Wang, J. Yang, X.L. Teng, W.J. Xia, and R. Jensen, “Feature Selection Based on Rough Sets and Particle Swarm Optimization,” Pattern Recognition Letters, vol. 28(4), pp. 459-471, 2007.
    [8] B. Apolloni, B. Simone, and B. Andrea, “Feature Selection via Boolean Independent Component Analysis,” Information Sciences, vol. 179(22), pp. 3815-3831, 2009.
    [9] Q.Z. Liu, A.H. Sung, M.Y. Qiao, Z.X. Chen, and B. Ribeiro, “An Improved Approach to Steganalysis of JPEG Images,” Information Sciences, vol. 180(9), pp. 1643-1655, 2010.
    [10] H. Stoppiglia, G. Dreyfus, R. Dubois, and Y. Oussar, “Ranking a Random Feature for Variable and Feature Selection,” Journal of Machine Learning Research, vol. 3, pp. 1399-1414, 2003.
    [11] Z.X. Zhu, Y.S. Ong, and M. Dash, “Wrapper-filter Feature Selection Algorithm Using a Memetic Framework,” IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics, vol. 37(1), pp. 70-76, 2007.
    [12] J.R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, San Francisco, CA, 1993.
    [13] H. Liu and H. Motoda, Feature Selection for Knowledge Discovery and Data Mining, Kluwer Academic Publishers, Boston, MA, 1998.
    [14] M. Brown and N.P. Costen, “Exploratory Basis Pursuit Classification,” Pattern Recognition Letters, vol. 26(12), pp. 1907-1915, 2005.
    [15] X.W. Chen, “An Improved Branch and Bound Algorithm for Feature Selection,” Pattern Recognition Letters, vol. 24(12), pp. 1925-1933, 2003.
    [16] H.B. Zhang and G.G. Sun, “Feature Selection Using Tabu Search Method,” Pattern Recognition, vol. 35(3), pp. 701-711, 2002.
    [17] S.W. Lin, Z.J. Lee, S.C. Chen, and T.Y. Tseng, “Parameter Determination of Support Vector Machine and Feature Selection Using Simulated Annealing Approach,” Applied Soft Computing, vol. 8(4), pp. 1505-1512, 2008.
    [18] I.S. Oh, J.S. Lee, and B.R. Moon, “Hybrid Genetic Algorithm for Feature Selection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26(11), pp. 1424-1437, 2004.
    [19] M.L. Zhang, J.M. Pena, and V. Robles, “Feature Selection for Multi-label Naïve Bayes Classification,” Information Sciences, vol. 179(19), pp. 3218-3229, 2009.
    [20] S. Geetha, N. Ishwarya, and N. Kamaraj, “Evolving Decision Tree Rule Based System for Audio Stego Anomalies Detection Based on Hausdorff Distance Statistics,” Information Sciences, vol. 180(13), pp. 2540-2559, 2010.
    [21] A. Jain and D. Zongker, “Feature Selection: Evaluation, Application, and Small Sample Performance,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19(2), pp. 153-158, 1997.
    [22] L. Davis, Genetic Algorithm and Simulated Annealing, Pitman publishing, London, 1987.
    [23] J.Y. Koo, C. Park, and M. Jhun, “A Classification Spline Machine for Building a Credit Scorecard,” Journal of Statistical Computation and Simulation, vol. 79(5), pp. 681-689, 2009.
    [24] C.S. Tsou and C.H. Kao, “An Electromagnetism-like Meta-heuristic for Multi-objective Optimization,” Proceeding of 2006 IEEE International Conference on Evolutionary Computation, Vancouver, BC, Canada, pp. 1172-1178, 2006.
    [25] P. Langley, W. Iba, and K. Thompson, “An Analysis of Bayesian Classifiers,” Proceedings of the Tenth National Conference on Artificial Intelligence, pp. 223-228, 1992.
    [26] R.O. Duda and P.E. Hart, Pattern Classification and Scene Analysis, Wiley, New York, 1973.
    [27] R. Kohavi, B. Becker, and D. Sommerfield, “Improving Simple Bayes,” M. van Someren, G. Widmer (Eds.), Poster Papers of the ECML-97, Charles University, Prague, pp. 78-87, 1997.
    [28] A.P. Dempster, N.M. Laird, and D.B. Rubin, “Maximum Likelihood from Incomplete Data via the EM Algorithm,” Journal of the Royal Statistical Society, Series B (Methodological), vol. 39(1), pp. 1-38, 1977.
    [29] I. Stanimirova and B. Walczak, “Classification of Data with Missing Elements and Outliers,” Talanta, vol. 76(3), pp. 602-609, 2008.
    [30] I. Stanimirova, M. Daszykowski, and B. Walczak, “Dealing with Missing Values and Outliers in Principal Component Analysis,” Talanta, vol. 72(1), pp. 172-178, 2007.
    [31] S. Serneels and T. Verdonck, “Principal Component Analysis for Data Containing Outliers and Missing Elements,” Computational Statistics & Data Analysis, vol. 52(3), pp. 1712-1727, 2008.
    [32] S. Rässler, “The Impact of Multiple Imputation for DACSEIS,” Technical Report DACSEIS Research Paper Series 5, Univ. of Erlangen-Nürnberg, Nürnberg, Germany, 2004.
    [33] D. Williams, X. Liao, Y. Xue, L. Carin, and B. Krishnapuram, “On Classification with Incomplete Data,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29(3), pp. 427-436, 2007.
    [34] R.J.A. Little and D.B. Rubin, Statistical Analysis with Missing Data, Wiley, New York, 1987.
    [35] M. Ramoni and P. Sebastiani, “Robust Bayes Classifiers,” Artificial Intelligence, vol. 125(1-2), pp. 209-226, 2001.
    [36] M. Ramoni and P. Sebastiani, “Robust Learning with Missing Data,” Machine Learning, vol. 45(2), pp. 147-170, 2001.
    [37] J.N. Chen, X.P. Xue, F.Z. Tian, and H.K. Huang, “An Algorithm for Classifying Incomplete Data with Selective Bayes Classifiers,” Proceeding of 2007 International Conference on Computational Intelligence and Security Workshops, pp. 445-448, 2007.
    [38] P.H. Winston, Artificial Intelligence, Addison-Wesley, Reading, MA, 1992.
    [39] T. Czekaj, W. Wu, and B. Walczak, “Classification of Genomic Data: Some Aspects of Feature Selection,” Talanta, vol. 76(3), pp. 564-574, 2008.
    [40] J. Nikbakhsh, G.A. Mohsen, and T.M. Reza, “A Discrete Binary Version of the Electromagnetism-like Heuristic for Solving Traveling Salesman Problem,” Lecture Notes in Computer Science, vol. 5227, pp. 123-130, 2008.
    [41] Z.Z. Yao and W.L. Ruzzo, “A Regression-based k Nearest Neighbor Algorithm for Gene Function Prediction from Heterogeneous Data,” BMC Bioinformatics, vol. 7 (suppl 1), p. S11, 2006.
    [42] B.V. Dasarathy, Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques, IEEE Computer Society Press, 1991.
    [43] D. Wettschereck and T.G. Dietterich, “An Experimental Comparison of the Nearest-neighbor and Nearest-hyperrectangle Algorithms,” Machine Learning, vol. 19(1), pp. 5-27, 1995.
    [44] L.I. Kuncheva and J.C. Bezdek, “Nearest Prototype Classification: Clustering, Genetic Algorithms, or Random Search?” IEEE Transactions on Systems, Man, and Cybernetics-Part C: Applications and Reviews, vol. 28(1), pp. 160-164, 1998.
    [45] L.I. Kuncheva and L.C. Jain, “Nearest Neighbor Classifier: Simultaneous Editing and Feature Selection,” Pattern Recognition Letters, vol. 20(11-13), pp. 1149-1156, 1999.
    [46] J.H. Chen, H.M. Chen, and S.Y. Ho, “Design of Nearest Neighbor Classifiers: Multi-objective Approach,” International Journal of Approximate Reasoning, vol. 40(1-2), pp. 3-22, 2005.
    [47] M.A. Tahir, A. Bouridane, F. Kurugollu, and A. Amira, “A Novel Prostate Cancer Classification Technique Using Intermediate Memory Tabu Search,” EURASIP Journal on Applied Signal Processing, vol. 2005(14), pp. 2241-2249, 2005.
    [48] Ş.İ. Birbil and S.C. Fang, “An Electromagnetism-like Mechanism for Global Optimization,” Journal of Global Optimization, vol. 25, pp. 263-282, 2003.
    [49] A.H.G.R. Kan and G.T. Timmer, “Stochastic Global Optimization Methods, part I: Clustering Methods,” Mathematical Programming, vol. 39, pp. 27-56, 1987.
    [50] A. Törn and S. Viitanen, “Topographical Global Optimization Using Pre-sampled Points,” Journal of Global Optimization, vol. 5, pp. 267-276, 1994.
    [51] R. Fletcher and C. Reeves, “Function Minimization by Conjugate Directions,” Computer Journal, vol. 7, pp. 149-154, 1964.
    [52] E.W. Cowan, Basic Electromagnetism, Academic Press, New York, 1968.
    [53] A.M.A.C. Rocha and E.M.G.P. Fernandes, “Modified Movement Force Vector in an Electromagnetism-like Mechanism for Global Optimization,” Optimization Methods and Software, vol. 24(2), pp. 253-270, 2009.
    [54] M.G. Alikhani, N. Javadian, and R. Tavakkoli-Moghaddam, “A Novel Hybrid Approach Combining Electromagnetism-like Method with Solis and Wets Local Search for Continuous Optimization Problems,” Journal of Global Optimization, vol. 44(2), pp. 227-234, 2009.
    [55] X.J. Wang, L. Gao, and C.Y. Zhang, “Electromagnetism-like Mechanism Based Algorithm for Neural Network Training,” Lecture Notes in Computer Science, vol. 5227, pp. 40-45, 2008.
    [56] P. Wu, K.J. Yang, and B.Y. Huang, “A Revised EM-like Mechanism for Solving the Vehicle Routing Problems,” Proceeding of the Second International Conference on Innovative Computing, Information and Control, Kumamoto, Japan, p. 181, 2007.
    [57] K. Yuan, S. Henequin, X.J. Wang, and L. Gao, “A New Heuristic-EM for Permutation Flowshop Scheduling,” Proceeding of the 12th IFAC Symposium on Information Control Problems in Manufacturing (INCOM06), Saint-Etienne, France, pp. 33-36, 2006.
    [58] P.E. Utgoff, “Incremental Induction of Decision Trees,” Machine Learning, vol. 4(2), pp. 161-186, 1989.
    [59] T.P. Hong and S.S. Tseng, “Models of Parallel Learning Systems,” Proceeding 1991 IEEE International Conference Distributed Computing Systems, pp. 125-132, 1991.
    [60] J.N. Chen, H.K. Huang, F.Z. Tian, and S.F. Tian, “A Selective Bayes Classifier for Classifying Incomplete Data Based on Gain Ratio,” Knowledge-Based Systems, vol. 21(7), pp. 530-534, 2008.
    [61] T.H. Lin, H.T. Li, and K.C. Tsai, “Implementing the Fisher’s Discriminant Ratio in a k-Means Clustering Algorithm for Feature Selection and Data Set Trimming,” Journal of Chemical Information and Computer Sciences, vol. 44(1), pp. 76-87, 2004.
    [62] S. Theodoridis and K. Koutroumbas, Pattern Recognition, Academic Press, New York, 1999.
    [63] T.S. Cheong and C.H. Yoon, “A Memory Based Classifier Using the Recursive Partition Averaging,” IEEE Tencon, pp. 1038-1041, 1999.
    [64] W. Wei, Artificial Neural Network, Beijing Aerospace University Press, Beijing, 1995.
    [65] C.C. Chang and C.J. Lin, LIBSVM: A Library for Support Vector Machines, 2001. LIBSVM version 2.91 from <http://www.csie.ntu.edu.tw/~cjlin/libsvm/>.
    [66] C.W. Hsu and C.J. Lin, “A Comparison of Methods for Multiclass Support Vector Machines,” IEEE Transactions on Neural Networks, vol. 13(2), pp. 415-425, 2002.
    [67] R. Duda, P. Hart, and D. Stork, Pattern Classification, New York, Wiley, 2000.
    [68] V. Tirumalai, K.G. Ricks, and K.A. Woodbury, “Using Parallelization and Hardware Concurrency to Improve the Performance of a Genetic Algorithm,” Concurrency and Computation- Practice and Experience, vol. 19(4), pp. 443-462, 2007.
    [69] H.A. Vrooman, C.A. Cocosco, F. Lijn, R. Stokking, M.A. Ikram, M.W. Vernooij, M.M.B. Breteler, and W.J. Niessen, “Multi-spectral Brain Tissue Segmentation Using Automatically Trained k-Nearest-neighbor Classification,” NeuroImage, vol. 37(1), pp. 71-81, 2007.
    [70] American Diabetes Association, “Gestational Diabetes Mellitus,” Diabetes Care, vol. 26(Suppl 1), pp. s103-s105, 2003.
    [71] American Diabetes Association, “Screening for Type 2 Diabetes,” Diabetes Care, vol. 26(Suppl 1), pp. s21-s24, 2003.
    [72] S.L. Kjos, R.K. Peters, A. Xiang, O.A. Henry, M. Montoro, and T.A. Buchanan, “Predicting Future Diabetes in Latino Women with Gestational Diabetes,” Diabetes, vol. 44(5), pp. 586-591, 1995.
    [73] B.E. Metzger, N.H. Cho, S.M. Roston, and R. Radvany, “Pregnancy Weight and Antepartum Insulin Secretion Predict Glucose Tolerance Five Years after Gestational Diabetes Mellitus,” Diabetes Care, vol. 16, pp. 1598-1605, 1993.
    [74] M. Albareda, A. Caballero, G. Badell, S. Piquer, A. Ortiz, A. de Leiva, and R. Corcoy, “Diabetes and Abnormal Glucose Tolerance in Women with Previous Gestational Diabetes,” Diabetes Care, vol. 26(4), pp. 1199-1205, 2003.
    [75] T.A. Buchanan, A. Xiang, S.L. Kjos, W.P. Lee, E. Trigo, I. Nader, E.A. Bergner, J.P. Palmer, and R.K. Peter, “Gestational Diabetes: Antepartum Characteristics that Predict Postpartum Glucose Intolerance and Type 2 Diabetes in Latino Women,” Diabetes, vol. 47(8), pp. 1302-1310, 1998.
    [76] W.A. Young and G.R. Weckman, “Using a Heuristic Approach to Derive a Grey-box Model through an Artificial Neural Network Knowledge Extraction Technique,” Neural Computing and Applications, vol. 19(3), pp. 353-366, 2010.
    [77] M. Khanmohammadi, A.B. Garmarudi, N. Khoddami, K. Shabani, and M. Khanlari, “A Novel Technique Based on Diffuse Reflectance Near-infrared Spectrometry and Back-propagation Artificial Neural Network for Estimation of Particle Size in TiO2 Nano Particle Samples,” Microchemical Journal, vol. 95(2), pp. 337-340, 2010.
    [78] L. Devroye, L. Györfi, and G. Lugosi, A Probabilistic Theory of Pattern Recognition, Springer, New York, 1996.
    [79] F.E.H. Tay and L. Shen, “Fault Diagnosis Based on Rough Set Theory,” Engineering Applications of Artificial Intelligence, vol. 16(1), pp. 39-43, 2003.
    [80] R. Abraham, J.B. Simha, and S.S. Iyengar, “Effective Discretization and Hybrid Feature Selection Using Naïve Bayesian Classifier for Medical Datamining,” International Journal of Computational Intelligence Research, vol. 5(2), pp. 116-129, 2009.
    [81] Y.T. Xu, L. Zhen, L.M. Yang, and L.S. Wang, “Classification Algorithm Based on Feature Selection and Samples Selection,” Lecture Notes in Computer Science, vol. 5552, pp. 631-638, 2009.
