| Field | Value |
|---|---|
| Graduate Student | 許錫弘 (Hsu, Xi-Hong) |
| Thesis Title | 正交多項式在深度算子神經網路的應用 (Application of Orthogonal Polynomials in Deep Operator Network) |
| Advisor | 汪上曉 (Wong, Shan-Hill) |
| Committee Members | 姚遠 (Yao, Yuan), 康嘉麟 (Kang, Jia-Lin), 葉廷仁 (Yeh, Ting-Jen) |
| Degree | Master (碩士) |
| Department | College of Engineering, Department of Chemical Engineering |
| Year of Publication | 2022 |
| Academic Year | 110 |
| Language | Chinese |
| Pages | 40 |
| Keywords (Chinese) | distributed parameter system, dynamic prediction, parameter regression, physics-informed neural network, deep operator network |
| Keywords (English) | Distributed Parameter System, Deep Operator Network |
A model of a distributed parameter system (DPS) is extremely important for the design and control of physical, biological, and engineering systems. Such a system can be represented by a partial differential equation with fixed boundary conditions but a variable manipulated-input function and initial function: y\left(z,t\right)=f\left(z,t,y\left(z,0\right),u\left(z,t\right)\right). This equation can also be viewed as a functional whose inputs are the manipulated-input function and the initial function.
In this study, we build a DeepONet and, based on its structure, develop two further architectures that approximate with orthogonal polynomials. We use them to model a distributed-parameter system with varying temperature, perform dynamic prediction with different amounts of training data, and use the latter architecture for parameter prediction.
In dynamic prediction, the DeepONet training results show that a "fully deep" network trains poorly, while the variant with deep trunk and branch networks improves as the number of training samples grows. The variant in which only the branch network is a deep neural network performs better still: with 500 training samples, its median prediction R² reaches 0.99. In parameter prediction, we find that changing the weights of the different loss terms affects estimation accuracy; when the weights are unbalanced, accuracy drops sharply, but in the best case the model achieves a parameter error below 0.09.
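The sensitivity to loss weighting described above is characteristic of a composite, PINN-style objective. The sketch below is purely illustrative: the thesis's actual loss terms are not specified here, and the data-fit/physics-residual split and the weight names `w_data` and `w_phys` are assumptions.

```python
import numpy as np

def total_loss(pred, obs, residual, w_data=1.0, w_phys=1.0):
    """Hypothetical weighted objective: data fit plus physics residual.

    The balance between w_data and w_phys is the knob the abstract
    says must not be allowed to become too unbalanced.
    """
    data_loss = np.mean((pred - obs) ** 2)   # mismatch with measurements
    phys_loss = np.mean(residual ** 2)       # PDE residual penalty
    return w_data * data_loss + w_phys * phys_loss

# toy numbers: two predictions vs. observations, two residual samples
L = total_loss(np.array([1.0, 2.0]), np.array([1.1, 1.9]),
               np.array([0.05, -0.02]))
```

Tuning `w_data` and `w_phys` (e.g. by grid search) is then part of model selection for the inverse problem.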
In summary, these experiments verify that the deep operator network (DeepONet) can model distributed parameter systems well, and that replacing the trunk network with orthogonal functions further improves both training results and parameter prediction.
Model representation of a distributed parameter system (DPS) is extremely important for design and control in process systems engineering. Such a system can be represented by a partial differential equation with fixed boundary conditions but a variable input function and initial condition: y\left(z,t\right)=f\left(z,t,y\left(z,0\right),u\left(z,t\right)\right). Such a system can be regarded as an operator on the functions y\left(z,0\right) and u\left(z,t\right), or as a functional of z and t.
It is well known in machine learning that an artificial neural network is a universal approximator. Lu et al. (2021) [23] demonstrated that an artificial neural network can also be a universal "functional" or "operator" approximator, and that training is greatly facilitated by taking the dot product of a "branch-net" and a "trunk-net". The branch-net receives the initial conditions as input, while the input to the trunk-net is the point of interest in space-time. A neural network representation of a DPS, known as DeepONet, can be developed by a data-driven approach, a first-principles approach, or a hybrid approach, as proposed in physics-informed neural networks (PINNs).
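The branch/trunk dot-product structure can be sketched as follows. This is a minimal illustration with untrained random weights; the sensor count `m`, latent dimension `p`, and layer sizes are arbitrary choices for the sketch, not the settings used in the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes, rng):
    # one (weight, bias) pair per layer
    return [(rng.normal(0.0, 0.5, (n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def mlp(x, params):
    # fully connected network with tanh hidden activations
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

m, p = 20, 10                        # sensor points, latent dimension
branch = init_mlp([m, 32, p], rng)   # encodes the sampled input function
trunk  = init_mlp([2, 32, p], rng)   # encodes the query point (z, t)

def deeponet(u_sensors, zt):
    # G(u)(z, t) ≈ dot product of branch and trunk embeddings
    return float(mlp(u_sensors, branch) @ mlp(zt, trunk))

u = np.sin(np.linspace(0.0, 1.0, m))     # an input function, sampled at sensors
y = deeponet(u, np.array([0.5, 0.1]))    # prediction at (z, t) = (0.5, 0.1)
```

In a real implementation the two sub-networks would be trained jointly on pairs of input functions and solution values.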
We found that DeepONet is a neural-network implementation of the PDE solution by orthogonal approximation of the functional/operator. Hence DeepONet can be divided into five subunits: a space trunk-net, a time trunk-net, an initial-condition trunk-net, an input-function trunk-net, and a branch-net. While all of them can be left in free form and trained for a specific problem, each trunk-net can instead be pretrained using orthogonal polynomials for a specific space-time domain, so that only the parameters of the branch-net must be trained for the specific problem.
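The pretrained-trunk idea can be illustrated by replacing a trained trunk-net with a fixed orthogonal basis. The sketch below uses a tensor product of Legendre polynomials in space and time, assuming both variables have been rescaled to [-1, 1]; the degree and variable names are illustrative, not the thesis's choices.

```python
import numpy as np
from numpy.polynomial import legendre

def poly_trunk(z, t, deg=3):
    """Fixed trunk: tensor product of Legendre polynomials at (z, t).

    Both z and t are assumed rescaled to [-1, 1]. Returns the
    (deg + 1)**2 basis values P_i(z) * P_j(t).
    """
    Pz = legendre.legvander([z], deg)[0]   # P_0(z) .. P_deg(z)
    Pt = legendre.legvander([t], deg)[0]   # P_0(t) .. P_deg(t)
    return np.outer(Pz, Pt).ravel()

phi = poly_trunk(0.2, -0.5)
# With this fixed basis, a prediction is c @ phi, where the coefficient
# vector c comes from the branch-net -- the only part left to train.
```

Because the basis is orthogonal on the domain, the branch-net's outputs play the role of spectral coefficients, which is what makes identification of the DPS simpler.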
This simplified approach was demonstrated on the forward solution of a heat-transfer problem studied by Gay and Ray (1995) [24], showing that with an orthogonal DeepONet, identification and solution of the DPS are much simplified.
[1] Sejnowski, T. J. (2020). The unreasonable effectiveness of deep learning in artificial intelligence. PNAS, 117, 30033–30038.
[2] Qin, T., Wu, K., & Xiu, D. (2019). Data driven governing equations approximation using deep neural networks. Journal of Computational Physics, 395, 620–635.
[3] Long, Z., Lu, Y., Ma, X., & Dong, B. (2017). PDE-Net: Learning PDEs from data. arXiv preprint arXiv:1710.09668.
[4] McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5, 115–133.
[5] 許恆修 (2019, June). 何謂 Artificial Neural Network? Medium. Retrieved July 2022, from https://r23456999.medium.com/%E4%BD%95%E8%AC%82-artificial-neural-netwrok-33c546c94794
[6] LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86, 2278–2324.
[7] Nair, V., & Hinton, G. E. (2010). Rectified linear units improve restricted Boltzmann machines. In ICML.
[8] Han, J., & Moraga, C. (1995). The influence of the sigmoid function parameters on the speed of backpropagation learning. In International Workshop on Artificial Neural Networks, 195–201.
[9] Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
[10] Montavon, G., Samek, W., & Müller, K. R. (2018). Methods for interpreting and understanding deep neural networks. Digital signal processing, 73, 1-15.
[11] Cybenko, G. (1989). Approximation by superpositions of a sigmoidal function. Mathematics of control, signals and systems, 2(4), 303-314
[12] Albawi, S., Mohammed, T. A., & Al-Zawi, S. (2017, August). Understanding of a convolutional neural network. In 2017 International Conference on Engineering and Technology (ICET) (pp. 1–6). IEEE.
[13] Lukoševičius, M., & Jaeger, H. (2009). Reservoir computing approaches to recurrent neural network training. Computer Science Review, 3(3), 127-149.
[14] Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. Advances in neural information processing systems, 27.
[15] Merenda, M. (2020, April). Edge machine learning for AI-enabled IoT devices: A review [figure]. ResearchGate. Retrieved July 2022, from https://www.researchgate.net/figure/Deep-Neural-Network-DNN-example_fig2_341037496
[16] Tommy Huang (2018, May). 卷積神經網路 (Convolutional neural network, CNN). Medium. Retrieved July 2022, from https://chih-sheng-huang821.medium.com/%E5%8D%B7%E7%A9%8D%E7%A5%9E%E7%B6%93%E7%B6%B2%E8%B7%AF-convolutional-neural-network-cnn-cnn%E9%81%8B%E7%AE%97%E6%B5%81%E7%A8%8B-ecaec240a631
[17] Feng, W. (2017, May). Audio-visual speech recognition with multimodal recurrent neural networks [figure]. ResearchGate. Retrieved July 2022, from https://www.researchgate.net/figure/The-standard-RNN-and-unfolded-RNN_fig1_318332317
[18] Kim, S. (2021, January). sooftware/seq2seq: PyTorch implementation of the RNN-based sequence-to-sequence architecture. GitHub. Retrieved July 2022, from https://github.com/sooftware/seq2seq
[19] Raissi, M. (2018). Deep hidden physics models: Deep learning of nonlinear partial differential equations. The Journal of Machine Learning Research, 19(1), 932-955.
[20] Chen, D., Gao, X., Xu, C., Wang, S., Chen, S., Fang, J., & Wang, Z. (2022). FlowDNN: a physics-informed deep neural network for fast and accurate flow prediction. Frontiers of Information Technology & Electronic Engineering, 23(2), 207-219.
[21] Chen, M., Lupoiu, R., Mao, C., Huang, D. H., Jiang, J., Lalanne, P., & Fan, J. (2021). Physics-augmented deep learning for high-speed electromagnetic simulation and optimization.
[22] Yang, L., Meng, X., & Karniadakis, G. E. (2021). B-PINNs: Bayesian physics-informed neural networks for forward and inverse PDE problems with noisy data. Journal of Computational Physics, 425, 109913.
[23] Lu, L., Jin, P., & Karniadakis, G. E. (2019). DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators. arXiv preprint arXiv:1910.03193.
[24] Gay, D. H., & Ray, W. H. (1995). Identification and control of distributed parameter systems by means of the singular value decomposition. Chemical Engineering Science, 50(10), 1519-1539.