
Graduate Student: Chen, Jyun-Wei (陳俊瑋)
Thesis Title: Excited-state energies of Schrodinger equation using CNN-RNN model (卷積-遞迴神經網路計算薛丁格方程的激發態能量)
Advisor: Chen, Jen-Hao (陳人豪)
Committee Members: Chen, Ren-Chuen (陳仁純); Liu, Jinn-Liang (劉晉良)
Degree: Master
Department: Institute of Computational and Modeling Science, College of Science
Year of Publication: 2020
Academic Year of Graduation: 108 (ROC calendar)
Language: English
Number of Pages: 35
Keywords: Convolutional neural network, Recurrent neural network, Long short-term memory, Schrodinger equation, Excited-state energy
    We train machine-learning models for three different potentials of the Schrodinger equation: simple harmonic oscillators, double-well inverted Gaussians, and infinite wells.
    The code is written in Python, and the neural networks are built and modified with the TensorFlow and Keras packages.
    In Kyle Mills (2017), the proposed 16-layer CNN performs well for a single output value, but the same architecture trained with multiple outputs performs very poorly, so we modify and extend it.

    We append two bidirectional LSTM layers after the 16-layer CNN architecture, increasing the number of output fields from one to ten.
    The training data are generated from the three potentials proposed in that work, with the eigenvalues computed by the finite difference method, as sketched below.
    Each sample is a potential at 256x256 resolution paired with 10 energy levels (one ground state and 9 excited states).
    The 16 convolutional layers of the literature mainly reduce the image resolution, encoding and compressing the input layer by layer.
    The recurrent layers then decode and extract information from the features passed down by the convolutional layers.
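    To make the data-generation step concrete, the following is a minimal Python (NumPy/SciPy) sketch that discretizes the two-dimensional Schrodinger eigenvalue problem with second-order finite differences and extracts the ten lowest energies. The grid size (64x64 instead of 256x256), the domain, the units (hbar = m = 1), and the helper name lowest_energies are illustrative assumptions, not the exact setup of the thesis.

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import eigsh

    def lowest_energies(V, L=20.0, k=10):
        """V: (n, n) array of potential values on an L x L square grid."""
        n = V.shape[0]
        h = L / (n - 1)                                   # grid spacing
        ones = np.ones(n)
        # 1D second-derivative matrix with Dirichlet boundary conditions
        D2 = sp.diags([ones[:-1], -2.0 * ones, ones[:-1]], [-1, 0, 1]) / h**2
        I = sp.identity(n)
        lap = sp.kron(D2, I) + sp.kron(I, D2)             # 2D Laplacian (Kronecker sum)
        H = -0.5 * lap + sp.diags(V.ravel())              # Hamiltonian with hbar = m = 1
        # k smallest eigenvalues: the ground state and k-1 excited states
        energies = eigsh(H, k=k, sigma=0, return_eigenvectors=False)
        return np.sort(energies)

    # Example: a 2D simple harmonic oscillator sampled on a 64x64 grid
    n = 64
    x = np.linspace(-10.0, 10.0, n)
    X, Y = np.meshgrid(x, x)
    V = 0.5 * (X**2 + Y**2)
    print(lowest_energies(V))   # continuum spectrum for this potential is 1, 2, 2, 3, 3, 3, ...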

    What makes our model distinctive is that the two-dimensional potential is treated as an image, where each pixel value represents the potential at that position, and each image (i.e., each potential) corresponds to multiple labels (i.e., energies).
    Moreover, these labels do not form a clear-cut classification problem, so it is not the 0-versus-1 distinction of handwritten digit recognition.
    We also replace the AdaDelta optimizer used in the original work with the now common Adam optimizer, and the output activation function of every layer is ReLU.
    The new model architecture reduces the training data from the original 200,000 samples to at most 60,000, while its predictive ability matches or even exceeds that of the original CNN model.
    Cutting the required data to roughly one third greatly reduces the computational load, the total training time, and the hardware requirements.
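    As an illustration of this architecture, the following is a minimal Keras sketch of a convolutional network followed by two bidirectional LSTM layers and ten output fields. The number of convolutional blocks, the filter counts, and the LSTM sizes are illustrative assumptions; the thesis uses a 16-layer CNN in front of the two BiLSTM layers.

    from tensorflow.keras import layers, models

    def build_model(input_shape=(256, 256, 1), n_levels=10):
        inputs = layers.Input(shape=input_shape)
        x = inputs
        # Convolutional blocks: each halves the spatial resolution,
        # compressing the potential image layer by layer.
        for filters in (16, 32, 64, 64):
            x = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
            x = layers.MaxPooling2D(2)(x)
        # Treat the compressed feature map as a sequence of rows for the recurrent layers.
        x = layers.Reshape((x.shape[1], x.shape[2] * x.shape[3]))(x)
        # Two bidirectional LSTM layers decode the compressed features.
        x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)
        x = layers.Bidirectional(layers.LSTM(64))(x)
        # Ten output fields: one ground-state and nine excited-state energies.
        outputs = layers.Dense(n_levels, activation='relu')(x)
        return models.Model(inputs, outputs)

    model = build_model()
    model.summary()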


    We construct a neural network model that can simultaneously predict the ground-state and many excited-state energies of the Schrodinger equation.
    We consider three different kinds of potentials: simple harmonic oscillators, double-well inverted Gaussians, and infinite wells.
    The tools used are Python, TensorFlow, and Keras.
    The training data are constructed by solving the eigenvalue problem of the Schrodinger equation with the finite difference method.
    The model architecture consists of a multilayer convolutional neural network followed by multilayer bidirectional Long Short-Term Memory.
    The image resolution is 256x256 grid points, and each grid point is a double-precision number representing the value of the potential at that position.
    The convolutional layers encode and compress the input layer by layer, and the recurrent layers then decode the features obtained from the convolutional network.
    The main characteristic of our model is that it can capture multiple energy levels for a given potential.
    In addition, we use the widely used Adam optimizer, and the output activation function of all layers is ReLU.
    Our numerical experiments show that only 20,000-60,000 training samples are needed to achieve the required accuracy, whereas the convolutional neural network alone requires 200,000-400,000 samples.
    The results show that the proposed neural network model achieves better accuracy on all computed energy levels than the method in the literature.
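    For illustration only, the following is a hedged sketch of how such a model might be compiled with the Adam optimizer and trained on (potential, energies) pairs, reusing build_model from the sketch above. The placeholder data, the mean-absolute-error loss, the batch size, and the number of epochs are assumptions rather than values reported in the thesis.

    import numpy as np

    # Placeholder arrays standing in for the real data set:
    # X: (N, 256, 256, 1) potential images, y: (N, 10) energy levels from the FDM solver.
    X = np.random.rand(1000, 256, 256, 1).astype('float32')
    y = np.random.rand(1000, 10).astype('float32')

    model = build_model()                         # the architecture sketched above
    model.compile(optimizer='adam', loss='mae')   # Adam optimizer; MAE is an assumed loss
    model.fit(X, y, batch_size=32, epochs=5, validation_split=0.1)

    pred = model.predict(X[:5])                   # predicted ground- and excited-state energies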

    abstract --------------- i
    Acknowledgement --------------- iv
    1 Introduction --------------- 1
    2 Neural network introduction --------------- 2
    2.1 Machine learning --------------- 2
    2.2 Artificial Neural Network --------------- 3
    2.3 Convolutional Neural Network --------------- 4
    2.4 Recurrent neural network --------------- 5
    2.4.1 Bidirectional Recurrent Neural Network --------------- 5
    2.4.2 Long Short Term Memory Network --------------- 7
    2.5 Optimizer --------------- 8
    3 Method --------------- 9
    3.1 Finite Difference Method --------------- 9
    3.1.1 Elliptic partial differential equation --------------- 10
    3.2 Training set (single electron problem) --------------- 11
    3.3 Neural network model --------------- 12
    3.3.1 Convolutional neural network --------------- 12
    3.3.2 BiLSTM --------------- 12
    3.3.3 Model algorithm --------------- 13
    4 Result --------------- 15
    4.1 Example 1. Simple harmonic oscillators --------------- 18
    4.2 Example 2. Double-well inverted Gaussians --------------- 22
    4.3 Example 3. Infinite wells --------------- 26
    5 Conclusion --------------- 30
    Reference --------------- 32
    Appendix --------------- 34

    [1] https://colah.github.io/posts/2015-08-Understanding-LSTMs/.
    [2] https://medium.com/@himadrisankarchatterjee/a-basic-introduction-to-convolutional-neural-network-8e39019b27c4.
    [3] Yoshua Bengio, Patrice Simard, and Paolo Frasconi. Learning long-term dependencies with gradient descent is difficult. IEEE transactions on neural networks, 5(2):157-166, 1994.
    [4] Richard Burden and JD Faires. Numerical analysis. Cengage Learning, 2004.
    [5] Giuseppe Carleo and Matthias Troyer. Solving the quantum many-body problem with artificial neural networks. Science, 355(6325):602-606, 2017.
    [6] Alex Graves and Navdeep Jaitly. Towards end-to-end speech recognition with recurrent neural networks. In International conference on machine learning, pages 1764-1772, 2014.
    [7] Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton. Speech recognition with deep recurrent neural networks. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 6645-6649. IEEE, 2013.
    [8] Sepp Hochreiter and Jurgen Schmidhuber. Long short-term memory. Neural computation, 9(8):1735-1780, 1997.
    [9] Leslie Pack Kaelbling, Michael L Littman, and Andrew W Moore. Reinforcement learning: A survey. Journal of artificial intelligence research, 4:237-285, 1996.
    [10] Nikhil Ketkar et al. Deep Learning with Python, volume 1. Springer, 2017.
    [11] Yoon Kim. Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882, 2014.
    [12] Diederik Kingma and Jimmy Ba. Adam: A method for stochastic optimization. Technical report, 2014.
    [13] Kyle Mills, Michael Spanner, and Isaac Tamblyn. Deep learning and the Schrodinger equation. Physical Review A, 96(4), 2017.
    [14] Keiron O'Shea and Ryan Nash. An introduction to convolutional neural networks. arXiv preprint arXiv:1511.08458, 2015.
    [15] Kevin Ryczko, David A. Strubbe, and Isaac Tamblyn. Deep learning and density-functional theory. Physical Review A, 100(2), 2019.
    [16] Mike Schuster and Kuldip K Paliwal. Bidirectional recurrent neural networks. IEEE transactions on Signal Processing, 45(11):2673-2681, 1997.
    [17] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
    [18] Ilya Sutskever, James Martens, George Dahl, and Geoffrey Hinton. On the importance of initialization and momentum in deep learning. In International conference on machine learning, pages 1139-1147, 2013.
    [19] Tijmen Tieleman and Geoffrey Hinton. Lecture 6.5 - RMSProp, COURSERA: Neural networks for machine learning. University of Toronto, Technical Report, 2012.
    [20] Yunchao Wei. CNN: Single-label to multi-label. Technical report, 2014.
    [21] Kun Yao, John E. Herr, David W. Toth, Ryker Mckintyre, and John Parkhill. The TensorMol-0.1 model chemistry: a neural network augmented with long-range physics. Chemical Science, 9(8):2261-2269, 2018.
    [22] Wojciech Zaremba, Ilya Sutskever, and Oriol Vinyals. Recurrent neural network regularization. arXiv preprint arXiv:1409.2329, 2014.
