Graduate Student: 侯芙堤 (Hou, Fu-Ti)
Thesis Title: Explaining Grammatical Error Correction Through Linguistic and Statistical Analysis (運用語言與統計分析解釋文法錯誤)
Advisor: 張俊盛 (Chang, Jason S.)
Committee Members: 陳浩然 (Chen, Hao-Jan); 張智星 (Jang, Jyh-Shing); 楊謦瑜 (Yang, Ching-Yu)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications
Year of Publication: 2021
Academic Year of Graduation: 109 (ROC calendar, 2020-2021)
Language: English
Number of Pages: 58
Keywords: Grammatical Error Correction, Grammatical Error Annotation, Corrective Feedback
    This thesis introduces a method for generating explanations of various types of grammatical errors from a given <erroneous, corrected> sentence pair. In our approach, the sentence pair is matched against a set of rules to classify the error type and to identify the error-causing word. The method involves analyzing various types of errors, designing rules to retrieve error-causing words, and writing explanation templates with slots. At run-time, the elements required by a template are extracted from the rule-matched sentence pair and filled into the template's slots to generate an explanation. We present ExplainInDepth, a prototype system that implements the method and provides explanatory feedback consisting of an error type, a problem word (i.e., the error-causing word), and a short description of the correction. Explanations generated for sentence pairs drawn from a dictionary of common errors and a learner corpus were evaluated manually, and the results show that the prototype generates reasonably good explanations for grammatical error correction (GEC).
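The rule-matching and template-filling pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the ExplainInDepth implementation: the word lists, the two toy rules, the choice of problem word (the word after a determiner edit, the word before a preposition edit), the templates, and the function names `first_edit` and `explain` are all assumptions made for illustration. Token alignment uses `difflib.SequenceMatcher` from the Python standard library.

```python
from difflib import SequenceMatcher

# Toy word lists and slot templates (hypothetical, not taken from the thesis).
DETERMINERS = {"a", "an", "the"}
PREPOSITIONS = {"about", "at", "for", "in", "of", "on", "to", "with"}

TEMPLATES = {
    "determiner": "{action} before '{problem}'.",
    "preposition": "{action} after '{problem}'.",
    "other": "{action}.",
}


def first_edit(src, tgt):
    """Return (wrong, correct, start, end) for the first differing token span."""
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            return src[i1:i2], tgt[j1:j2], i1, i2
    return None  # the two sentences are identical


def explain(erroneous, corrected):
    """Build an explanation dict for the first edit, or None if there is none."""
    src, tgt = erroneous.split(), corrected.split()
    edit = first_edit(src, tgt)
    if edit is None:
        return None
    wrong, correct, start, end = edit
    changed = {w.lower() for w in wrong + correct}

    # Rule matching: classify the edit and pick a problem word (assumed heuristic).
    if changed <= DETERMINERS and end < len(src):
        error_type, problem = "determiner", src[end]
    elif changed <= PREPOSITIONS and start > 0:
        error_type, problem = "preposition", src[start - 1]
    else:
        error_type, problem = "other", None

    # Describe the edit itself as the first template slot.
    wrong_text, correct_text = " ".join(wrong), " ".join(correct)
    if wrong and correct:
        action = f"replace '{wrong_text}' with '{correct_text}'"
    elif correct:
        action = f"insert '{correct_text}'"
    else:
        action = f"delete '{wrong_text}'"

    # Template filling: put the extracted elements into the slots.
    text = TEMPLATES[error_type].format(action=action, problem=problem)
    return {
        "error_type": error_type,
        "problem_word": problem,
        "description": text[0].upper() + text[1:],
    }
```

For example, `explain("I am good in math", "I am good at math")` classifies the edit as a preposition error with problem word `good` and produces the description `Replace 'in' with 'at' after 'good'.` A real system would need part-of-speech and dependency information rather than fixed word lists to cover lexical, pattern, noun, and verb errors.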

    Abstract
    Chinese Abstract
    Acknowledgements
    Contents
    List of Figures
    List of Tables
    1 Introduction
    2 Related Work
    3 Method
        3.1 Problem Statement
        3.2 Learning to Identify Problem Words for Different Error Types
            3.2.1 Analyzing Errors and Identifying Problem Words
            3.2.2 Constructing Template Set for Each Error Type
        3.3 Filling Templates at Run-Time
            3.3.1 Retrieving Template Parameters from Input Pair
            3.3.2 Generating Template Parameters for Lexical Errors
            3.3.3 Generating Template Parameters for Pattern Errors
            3.3.4 Generating Template Parameters for Determiner Errors
            3.3.5 Generating Template Parameters for Noun Errors
            3.3.6 Generating Template Parameters for Verb Errors
    4 Experiments and Evaluation
        4.1 Datasets
        4.2 Experimental Setup
        4.3 Evaluation and Discussion
            4.3.1 Selecting Sentence Pair for Each Type
            4.3.2 Evaluation Metrics
            4.3.3 Evaluating Problem Word
            4.3.4 Evaluation Results
            4.3.5 Discussion
    5 Conclusion and Future Work
    References
    Appendix

