| Field | Value |
|---|---|
| Graduate Student | 朱育欣 (Chu, Yu-Hsin) |
| Thesis Title | Grammar Booster: Improving Sentence Grammar Levels Based on Generative AI (利用生成式AI提升句子文法等級) |
| Advisor | 張俊盛 (Chang, Jason S.) |
| Oral Defense Committee | 張智星 (Jang, Jyh-Shing); 鍾曉芳 (Chung, Siaw-Fong) |
| Degree | Master |
| Department | Department of Computer Science, College of Electrical Engineering and Computer Science |
| Year of Publication | 2024 |
| Academic Year of Graduation | 113 |
| Language | English |
| Pages | 60 |
| Keywords (Chinese) | 語言模型、生成式人工智慧、序列標記、重述或譯意 |
| Keywords (English) | Language Model, Generative Artificial Intelligence, Sequence Tagging, Rephrasing or Paraphrasing |
Abstract: We present a sentence improvement system that automatically detects grammar elements and suggests appropriate alternative sentences with higher-level syntactic structures. In our approach, we construct two training datasets based on graded grammar elements: one for detecting grammar elements and one for improving sentences. The method detects the grammar elements in a sentence and then generates an improved sentence using a language model and generative AI. A preliminary evaluation shows that the proposed method performs reasonably well both in detecting grammar elements and in improving sentences to higher grammar levels.
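The two-stage approach described in the abstract — first tag the grammar elements in a sentence, then rewrite it at a higher grammar level — can be illustrated with a toy rule-based sketch. The function names, labels, and the single coordination-to-subordination rule below are illustrative assumptions only; they stand in for the thesis's trained sequence-tagging and generative models, not reproduce them:

```python
# Toy sketch of the two-stage pipeline: (1) detect grammar elements,
# (2) rewrite the sentence at a higher grammar level. Both functions are
# rule-based placeholders for the trained models described in the thesis.

COORDINATORS = {"and", "but", "or", "so"}
SUBORDINATORS = {"because", "although", "if", "when"}

def detect_grammar_elements(sentence: str) -> list[tuple[str, str]]:
    """Tag each token with a coarse grammar-element label (toy detector)."""
    tags = []
    for tok in sentence.split():
        word = tok.lower().strip(".,!?")
        if word in SUBORDINATORS:
            tags.append((tok, "SUBORDINATOR"))
        elif word in COORDINATORS:
            tags.append((tok, "COORDINATOR"))
        else:
            tags.append((tok, "O"))
    return tags

def improve_sentence(sentence: str) -> str:
    """Upgrade one coordination pattern to subordination (toy rewriter)."""
    # ", so " (coordination, lower grammar level) becomes
    # "Because ..., ..." (subordination, higher grammar level).
    if ", so " in sentence:
        left, right = sentence.split(", so ", 1)
        return "Because " + left[0].lower() + left[1:] + ", " + right
    return sentence  # no applicable rule: return the sentence unchanged

print(improve_sentence("He was tired, so he went to bed."))
# -> Because he was tired, he went to bed.
```

In the actual system, the detector would be a trained sequence tagger and the rewrite would come from a generative model conditioned on the detected elements; the hand-written rule here only makes the control flow of the pipeline concrete.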