Graduate Student: Liao, Yuan-Qun (廖元群)
Thesis Title: Reinforcement Learning and Large Language Model for Thermodynamic Properties Molecular Design (利用強化學習及大型語言模型進行熱力學性質分子設計)
Advisors: Wong, Shan-Hill (汪上曉); Yao, Yuan (姚遠)
Oral Defense Committee: Jang, Shi-Shang (鄭西顯); Kang, Jia-Lin (康嘉麟)
Degree: Master
Department: College of Engineering - Department of Chemical Engineering
Year of Publication: 2023
Academic Year of Graduation: 111
Language: Chinese
Number of Pages: 76
Keywords: Molecular design, MolDQN, large language models, ChatGPT, GPT-3
Abstract:
In chemistry, finding molecules with desirable properties is a highly challenging task, especially in the design of specialty chemicals or new drugs: researchers must search a vast chemical space for the few molecules that possess the desired properties. In recent years, advances in computing power have driven the rapid development of machine learning, which now offers a variety of methods for predicting molecular properties and designing chemical products. These methods, collectively known as computer-aided molecular design (CAMD), have dramatically shortened the development cycle of chemical products and reduced R&D costs.
In this thesis we study two very different models for generating molecules with a target solubility parameter, both taking the Simplified Molecular Input Line Entry Specification (SMILES) as input: a reinforcement learning method called MolDQN, and a fine-tuned version of the large language model GPT-3.
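Because both approaches read and emit SMILES strings, generated candidates must be checked for chemical validity before any property is evaluated. The following is a minimal sketch of such a check, assuming the open-source RDKit toolkit rather than any specific code from the thesis:

```python
from typing import Optional

from rdkit import Chem


def canonicalize(smiles: str) -> Optional[str]:
    """Parse a SMILES string and return its canonical form, or None if it is invalid."""
    mol = Chem.MolFromSmiles(smiles)  # returns None when the string is not valid SMILES
    if mol is None:
        return None
    return Chem.MolToSmiles(mol)


# Example: two different SMILES spellings of toluene map to one canonical string,
# while a malformed string (unclosed ring) is rejected.
print(canonicalize("c1ccccc1C"))  # 'Cc1ccccc1'
print(canonicalize("Cc1ccccc1"))  # 'Cc1ccccc1'
print(canonicalize("C1CC"))       # None
```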
MolDQN combines the DQN algorithm from reinforcement learning with chemical domain knowledge to optimize molecular properties. Only chemically reasonable modification actions are allowed, which guarantees that every optimized molecule is valid; molecular modifications serve as the actions and the desired property as the reward, so molecules are generated from scratch without any pre-training. For target-conditioned generation of the solubility parameter, we found that the agent can learn how to assemble or modify a molecule from the reward signal alone, with up to 50% of the generated molecules falling within the target range.
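MolDQN frames generation as a Markov decision process: the state is the current molecule, an action is one valid modification (adding an atom, adding or removing a bond), and the reward reflects how close the molecule's property is to the target. The exact reward and network used in the thesis are not reproduced here; the sketch below only illustrates the idea, with a hypothetical property predictor `predict_sp` and Q-function `q_value` passed in as placeholders and a simple negative-absolute-deviation reward assumed:

```python
import random
from typing import Callable, List


def reward(smiles: str,
           predict_sp: Callable[[str], float],
           target: float,
           tol: float = 0.5) -> float:
    """Highest reward when the predicted solubility parameter falls in the target window."""
    delta = abs(predict_sp(smiles) - target)
    return 1.0 if delta <= tol else -delta  # penalize molecules far from the target


def epsilon_greedy_step(state: str,
                        valid_actions: List[str],
                        q_value: Callable[[str, str], float],
                        epsilon: float) -> str:
    """Pick the next molecule: explore with probability epsilon,
    otherwise take the modification with the highest predicted Q-value."""
    if random.random() < epsilon:
        return random.choice(valid_actions)
    return max(valid_actions, key=lambda a: q_value(state, a))
```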
We also examined ChatGPT's ability to plan molecular design experiments. The results show that ChatGPT can efficiently provide suggestions at the level of overall direction, and that with precise descriptions and step-by-step guidance it can generate simulation code very efficiently. In addition, GPT-3, the base model behind ChatGPT, was fine-tuned to perform few-shot molecular design: the boiling point of each molecule was provided as an easily obtainable auxiliary property, and the model was asked to find molecules with a specific solubility parameter. The results show that, when the prompt states the requirement explicitly, the model can effectively learn to generate reasonable SMILES, doubling the success rate for generating molecules with extreme solubility parameters, although there is still room for improvement for more typical solubility parameter values.
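The fine-tuning experiments pair a textual prompt describing the desired properties with a SMILES completion. The thesis' exact prompt wording and dataset are not reproduced here; the sketch below shows one plausible way to assemble training examples in the prompt/completion JSONL format accepted by the GPT-3 fine-tuning service, with boiling point included as the easily obtained auxiliary property. The records are illustrative placeholders, not data from the thesis.

```python
import json

# Hypothetical training records: (boiling point [K], solubility parameter [MPa^0.5], SMILES).
records = [
    (353.0, 18.7, "c1ccccc1"),  # benzene
    (351.0, 26.5, "CCO"),       # ethanol
]

with open("finetune_data.jsonl", "w") as f:
    for bp, sp, smiles in records:
        example = {
            # The prompt states the target properties explicitly and ends with a fixed separator.
            "prompt": f"Boiling point: {bp} K. Solubility parameter: {sp} MPa^0.5. SMILES:",
            # The completion starts with a space and ends with a stop token, per the legacy format.
            "completion": f" {smiles} END",
        }
        f.write(json.dumps(example) + "\n")
```

A file in this format could then be submitted through OpenAI's legacy fine-tuning interface (the `openai api fine_tunes.create` CLI command documented for GPT-3-era models); that interface has since been revised.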