Graduate Student: Yeh, Li-Chia (葉力嘉)
Thesis Title: A Rigorous Proof of Finite Iterations in SMO-SVM and Parameter Optimization in Kernel-Based SVM
Advisor: Lu, Chung-Chin (呂忠津)
Committee Members: Chen, Bor-Sen (陳博現), Huang, Yuan-Hao (黃元豪), Ma, Hsi-Pin (馬席彬), Su, Szu-Lin (蘇賜麟), Lin, Mao-Chao (林茂昭)
Degree: Doctor
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2019
Academic Year: 107
Language: English
Pages: 76
Keywords: support vector machine, finite iteration, hyperparameter optimization, classification, machine learning, kernel matrix
Abstract: The support vector machine (SVM) is a well-known supervised binary classifier. However, when the dataset is large, it is hard to train on all the data at once to optimize the trained classifier. In 1998, Platt proposed an iterative algorithm, sequential minimal optimization (SMO), in which only two Lagrange multipliers λ_i and λ_j are selected for updating in each iteration, in order to minimize the computational cost. When SVM is implemented with SMO, we call it SMO-SVM. Later, in 2001, Keerthi et al. improved Platt's SMO by simplifying the optimality check to the stopping criterion F_i − F_j ≤ τ, where τ > 0 is a nonzero tolerance. In 2002, Keerthi and Gilbert proved that SMO-SVM terminates in finitely many iterations. Although some researchers later filled in parts of the incomplete argument in Keerthi and Gilbert's work, their proofs were still based on the asymptotic behavior of the Lagrange multipliers under the assumption of an infinitely iterating SMO-SVM. Such an assumption, however, is invalid under the stopping criterion F_i − F_j ≤ τ. In this research, we give a new and rigorous proof that SMO-SVM terminates in finitely many iterations. We also analyze the relation between the hyperparameters and the test error rate. Based on these findings, we propose mini core validation (miniCV) to quickly screen out an optimized hyperparameter combination, especially for large datasets. The proposed miniCV is a parameter optimization approach built entirely on the distribution of the data generated by the iterative SMO training process. Since miniCV depends only on the kernel matrix, it avoids the cross-validation otherwise needed to optimize hyperparameters in kernel-based SVM. Moreover, through our key findings on a training-related variable that traces test performance, miniCV is able to locate a robust hyperparameter combination with respect to the given training dataset.
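The record does not reproduce the thesis's algorithmic details, but the SMO loop and tolerance-based stopping rule that the abstract refers to are well documented in the cited literature ([4], [5]). The following minimal Python sketch illustrates that mechanism: maximal-violating-pair working-set selection with a Keerthi-style check b_low − b_up ≤ 2τ, one standard form of the F_i − F_j ≤ τ criterion. The function names, the RBF kernel choice, and all parameter values here are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def rbf_kernel(X, Z, gamma):
    # RBF Gram matrix: K[i, j] = exp(-gamma * ||X_i - Z_j||^2).
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Z**2, 1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def smo_train(X, y, C=1.0, gamma=0.5, tau=1e-3, max_iter=10_000):
    """Simplified SMO sketch (not the thesis's code): maximal-violating-pair
    selection with the Keerthi-style stopping check b_low - b_up <= 2*tau."""
    n = len(y)
    K = rbf_kernel(X, X, gamma)
    alpha = np.zeros(n)
    F = -y.astype(float)  # F_k = sum_j alpha_j y_j K[j,k] - y_k; alpha = 0 initially
    for it in range(max_iter):
        in_up = ((y > 0) & (alpha < C)) | ((y < 0) & (alpha > 0))
        in_low = ((y > 0) & (alpha > 0)) | ((y < 0) & (alpha < C))
        i = np.flatnonzero(in_up)[np.argmin(F[in_up])]    # attains b_up = min F over I_up
        j = np.flatnonzero(in_low)[np.argmax(F[in_low])]  # attains b_low = max F over I_low
        if F[j] - F[i] <= 2.0 * tau:   # every violating pair is now within tolerance
            break
        # Feasible box [L, H] for alpha_j, preserving sum_k alpha_k y_k = 0.
        if y[i] != y[j]:
            L, H = max(0.0, alpha[j] - alpha[i]), min(C, C + alpha[j] - alpha[i])
        else:
            L, H = max(0.0, alpha[i] + alpha[j] - C), min(C, alpha[i] + alpha[j])
        eta = K[i, i] + K[j, j] - 2.0 * K[i, j]  # curvature along the update direction
        aj = np.clip(alpha[j] + y[j] * (F[i] - F[j]) / max(eta, 1e-12), L, H)
        if abs(aj - alpha[j]) < 1e-12:
            break  # numerical stall guard for this simplified sketch
        ai = alpha[i] + y[i] * y[j] * (alpha[j] - aj)
        # Update the cached gradient F for the two changed multipliers only (O(n)).
        F += (ai - alpha[i]) * y[i] * K[i] + (aj - alpha[j]) * y[j] * K[j]
        alpha[i], alpha[j] = ai, aj
    b = (F[i] + F[j]) / 2.0  # threshold in [b_up, b_low]; f(x) = sum_j alpha_j y_j k(x_j, x) - b
    return alpha, b, it

# Toy usage: two Gaussian blobs with labels in {-1, +1}.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 0.6, (40, 2)), rng.normal(1.0, 0.6, (40, 2))])
y = np.hstack([-np.ones(40), np.ones(40)])
alpha, b, iters = smo_train(X, y, C=1.0, gamma=0.5, tau=1e-3)
decision = (alpha * y) @ rbf_kernel(X, X, gamma=0.5) - b
print(f"stopped after {iters} iterations; train accuracy = {np.mean(np.sign(decision) == y):.2f}")
```

Each iteration touches only two multipliers and refreshes the cached gradient F in O(n), which is what makes SMO attractive for large datasets, as the abstract notes; the tolerance τ in the stopping check is exactly the quantity whose role in guaranteeing finite termination the thesis makes rigorous.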
[1] V. N. Vapnik, The Nature of Statistical Learning Theory. Berlin, Heidelberg: Springer-Verlag, 1995.
[2] B. Schölkopf, A. Smola, and K.-R. Müller, “Nonlinear component analysis as a kernel eigenvalue problem,” Neural Computation, vol. 10, no. 5, pp. 1299–1319, 1998. [Online]. Available: https://doi.org/10.1162/089976698300017467
[3] B. Schölkopf, R. Herbrich, and A. J. Smola, “A generalized representer theorem,” in Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory, ser. COLT ’01/EuroCOLT ’01. London, UK: Springer-Verlag, 2001, pp. 416–426. [Online]. Available: http://dl.acm.org/citation.cfm?id=648300.755324
[4] J. C. Platt, “A fast algorithm for training support vector machines,” Advances in Kernel Methods - Support Vector Learning, vol. 208, Jul. 1998.
[5] S. S. Keerthi, S. K. Shevade, C. Bhattacharyya, and K. R. K. Murthy, “Improvements to Platt’s SMO algorithm for SVM classifier design,” Neural Computation, vol. 13, no. 3, pp. 637–649, Mar. 2001.
[6] S. Keerthi and E. Gilbert, “Convergence of a generalized SMO algorithm for SVM classifier design,” Machine Learning, vol. 46, no. 1, pp. 351–360, Jan 2002. [Online]. Available: https://doi.org/10.1023/A:1012431217818
[7] N. Takahashi and T. Nishi, “Rigorous proof of termination of SMO algorithm for support vector machines,” IEEE Transactions on Neural Networks, vol. 16, no. 3, pp. 774–776, May 2005.
[8] D. Dua and C. Graff, “UCI machine learning repository,” 2017. [Online]. Available: http://archive.ics.uci.edu/ml
[9] A. Tharwat, A. E. Hassanien, and B. E. Elnaghi, “A BA-based algorithm for parameter optimization of support vector machine,” Pattern Recognition Letters, vol. 93, pp. 13–22, 2017 (Pattern Recognition Techniques in Data Mining).
[10] P. Devijver and J. Kittler, Pattern Recognition: A Statistical Approach. London, UK: Prentice-Hall, 1982.
[11] S. Geisser, Predictive Inference. New York, NY: Chapman and Hall, 1993.
[12] R. Kohavi, “A study of cross-validation and bootstrap for accuracy estimation and model selection,” in Proceedings of the 14th International Joint Conference on Artificial Intelligence - Volume 2, ser. IJCAI’95. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1995, pp. 1137–1143. [Online]. Available: http://dl.acm.org/citation.cfm?id=1643031.1643047
[13] C.-W. Hsu and C.-J. Lin, “A comparison of methods for multiclass support vector machines,” IEEE Transactions on Neural Networks, vol. 13, no. 2, pp. 415–425, Mar. 2002. [Online]. Available: https://doi.org/10.1109/72.991427
[14] A. L. D. Rossi and A. C. P. L. F. de Carvalho, “Bio-inspired optimization techniques for SVM parameter tuning,” in 2008 10th Brazilian Symposium on Neural Networks, Oct 2008, pp. 57–62.
[15] S. Lessmann, R. Stahlbock, and S. F. Crone, “Genetic algorithms for support vector machine model selection,” in The 2006 IEEE International Joint Conference on Neural Network Proceedings, July 2006, pp. 3063–3069.
[16] B. F. De Souza, A. C. P. L. F. De Carvalho, R. Calvo, and R. P. Ishii, “Multiclass SVM model selection using particle swarm optimization,” in 2006 Sixth International Conference on Hybrid Intelligent Systems (HIS’06), Dec 2006, pp. 31–31.
[17] X. Zhang, X. Chen, and Z. He, “An ACO-based algorithm for parameter optimization of support vector machines,” Expert Systems with Applications, vol. 37, no. 9, pp. 6618–6628, 2010. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0957417410002630
[18] I. Aydin, M. Karakose, and E. Akin, “A multi-objective artificial immune algorithm for parameter optimization in support vector machine,” Applied Soft Computing, vol. 11, no. 1, pp. 120–129, 2011. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1568494609002166
[19] R.-E. Fan, P.-H. Chen, and C.-J. Lin, “Working set selection using second order information for training support vector machines,” J. Mach. Learn. Res., vol. 6, pp. 1889–1918, Dec. 2005. [Online]. Available: http://dl.acm.org/citation.cfm?id=1046920.1194907
[20] U. von Luxburg, “A tutorial on spectral clustering,” CoRR, vol. abs/0711.0189, 2007. [Online]. Available: http://arxiv.org/abs/0711.0189
[21] Y. LeCun, C. Cortes, and C. J. C. Burges, “The MNIST database of handwritten digits,” http://yann.lecun.com/exdb/mnist/, accessed: 2018-11-30.
[22] B. Johnson, R. Tateishi, and N. Hoan, “A hybrid pansharpening approach and multiscale object-based image analysis for mapping diseased pine and oak trees,” International Journal of Remote Sensing, vol. 34, pp. 6969–6982, Oct. 2013.
[23] R. J. Lyon, B. W. Stappers, S. Cooper, J. M. Brooke, and J. D. Knowles, “Fifty years of pulsar candidate selection: from simple filters to a new principled real-time classification approach,” Monthly Notices of the Royal Astronomical Society, vol. 459, no. 1, pp. 1104–1123, Apr. 2016. [Online]. Available: https://doi.org/10.1093/mnras/stw656
[24] M. A. U. H. Tahir, S. Asghar, A. Manzoor, and M. A. Noor, “A classification model for class imbalance dataset using genetic programming,” IEEE Access, vol. 7, pp. 71013–71037, 2019.
[25] T. M. Mohamed, “Pulsar selection using fuzzy KNN classifier,” Future Computing and Informatics Journal, vol. 3, no. 1, pp. 1–6, 2018.