| Graduate Student: | Lin, Zi-Jie (林子傑) |
| --- | --- |
| Thesis Title: | 應用改良式簡化群體演算法改善圖卷積神經網路之結構(NAS)與超參數(HPO) (Using improved SSO to optimize Graph Convolution Network neural architecture search and hyperparameter optimization) |
| Advisor: | Yeh, Wei-Chang (葉維彰) |
| Committee Members: | 謝宗融, 梁韵嘉, 賴智明 |
| Degree: | Master |
| Department: | College of Engineering – Department of Industrial Engineering and Engineering Management |
| Year of Publication: | 2024 |
| Graduation Academic Year: | 112 (AY 2023–2024) |
| Language: | Chinese |
| Number of Pages: | 75 |
| Keywords (Chinese): | 圖卷積神經網路、簡化群體演算法、改良式簡化群體演算法、超參數優化、神經網路搜索 |
| Keywords (English): | GCN, Simplified Swarm Optimization, Improved Simplified Swarm Optimization, Hyperparameter Optimization, Neural Architecture Search |
Graphs are among the more complex of modern data structures: composed of nodes and edges, they represent relationships between entities. With the rise of deep learning, attention has turned to applying neural networks to graphs. Graph Neural Networks (GNNs) are characterized by iteratively updating node representations, exploiting information from adjacent nodes, and learning over the graph structure itself.
One of the most important breakthroughs in GNNs is the Graph Convolutional Network (GCN), which lets a neural network apply convolution to graphs. GCNs process graph data effectively and have achieved excellent performance in fields such as social networks, recommender systems, and bioinformatics.
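To make graph convolution concrete, below is a minimal NumPy sketch of the layer-wise GCN propagation rule of Kipf and Welling, H^(l+1) = σ(D̂^(-1/2) Â D̂^(-1/2) H^(l) W^(l)), where Â is the adjacency matrix with self-loops and D̂ its degree matrix. The function name and the ReLU choice are illustrative assumptions, not the thesis's own code.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step: H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W).

    A: (n, n) adjacency matrix, H: (n, d_in) node features,
    W: (d_in, d_out) trainable weights.
    """
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    deg = A_hat.sum(axis=1)                   # node degrees of A_hat (always > 0)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))  # D^(-1/2)
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)    # linear transform + ReLU
```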
This study uses Simplified Swarm Optimization (SSO) together with Improved Simplified Swarm Optimization (iSSO) to perform hyperparameter optimization (HPO) and neural architecture search (NAS) for GCNs, automatically searching over parameters of mixed encodings, both continuous and discrete. Previously, applying a GCN to a new dataset or environment required thousands to tens of thousands of trial-and-error runs to find a reasonable and competitive combination of hyperparameters and network structure; moreover, when architectural choices such as the optimizer, activation function, and update-rule type were left out of the search space, or when the search range or method was too simple, applying the search directly to a GCN yielded only limited gains. This study therefore proposes MSSO (Mixed SSO), which uses SSO and iSSO simultaneously to optimize the overall GCN structure and hyperparameters. Without manually tuning hyperparameters or intricate network structures, users can reach or surpass the accuracy reported in previous work across different datasets. The approach also applies broadly to other graph datasets: compared with the original GCN, it improves performance by more than 1% on every dataset tested, a clear gain.
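For intuition about how an SSO-style search updates a candidate GCN configuration, here is a minimal sketch of Yeh's stepwise SSO update, with each solution encoded as a vector of hyperparameter values. The parameter values Cg/Cp/Cw, the `sample` callback, and the function name are illustrative assumptions, not the thesis's exact MSSO implementation.

```python
import random

def sso_update(x, pbest, gbest, sample, cg=0.4, cp=0.7, cw=0.9):
    """One SSO position update (Yeh's stepwise rule).

    For each variable j, draw rho ~ U(0,1) and copy the j-th value from
    gBest (rho < Cg), pBest (Cg <= rho < Cp), the current solution
    (Cp <= rho < Cw), or a fresh random draw sample(j) otherwise.
    Because the rule only copies or resamples values, continuous and
    discrete hyperparameters can share the same update.
    """
    new = list(x)
    for j in range(len(x)):
        rho = random.random()
        if rho < cg:
            new[j] = gbest[j]       # move toward the global best
        elif rho < cp:
            new[j] = pbest[j]       # move toward the personal best
        elif rho < cw:
            pass                    # keep the current value x[j]
        else:
            new[j] = sample(j)      # random legal value for variable j
    return new
```

For a discrete dimension such as the optimizer type, `sample` might be `lambda j: random.choice(space[j])`; for a continuous one such as the learning rate, a `random.uniform` draw over its range.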
Keywords: Graph Convolutional Network, Simplified Swarm Optimization, Improved Simplified Swarm Optimization, Hyperparameter Optimization, Neural Architecture Search
Graphs are comparatively complex modern data structures, composed of nodes and edges and used to represent relationships between entities. With the rise of deep learning, attention has turned toward applying neural networks to graphs. Graph Neural Networks (GNNs) possess distinctive features, including iteratively updating node representations, passing messages among adjacent nodes, and learning the structure of the graph itself.
One of the most significant breakthroughs in GNNs is the introduction of Graph Convolutional Networks (GCNs). GCNs allow neural networks to apply convolution to graphs and to process graph data effectively, achieving remarkable performance in fields such as social networks, recommender systems, and bioinformatics.
This study aims to automatically search parameter groups encoding different characteristics, such as continuous and discrete types, in hyperparameter optimization and neural architecture search, using Simplified Swarm Optimization (SSO) and Improved Simplified Swarm Optimization (iSSO). In the past, achieving reasonable and competitive combinations of hyperparameters and network structures for a GCN across different datasets and environments required numerous trial runs, often numbering in the thousands or even tens of thousands. This study employs MSSO (Mixed SSO), which uses SSO and iSSO together, to simultaneously optimize the overall structure and hyperparameters of the GCN. Users are relieved of the burden of manually adjusting hyperparameters and intricate network structures, and the resulting accuracy meets or surpasses the standards set by previous literature across numerous datasets.
Keywords: GCN, Simplified Swarm Optimization, Improved Simplified Swarm Optimization, Hyperparameter Optimization, Neural Architecture Search