
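Author: 張祐維 (Chang, Yu Wei)
Thesis title: Automatic Correction of Grammatical Errors in English learner Writing
Advisor: 張俊盛
Committee members: 張嘉惠, 陳信希, 柯淑津
Degree: Master
Department: College of Electrical Engineering and Computer Science, Institute of Information Systems and Applications
Year of publication: 2014
Academic year of graduation: 102
Language: English
Number of pages: 42
Keywords: automatic grammatical error correction, English learning, language model, confusable words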
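    English learners often make various errors in their writing without realizing it. In this thesis, we propose a method for automatically correcting grammatical errors in English. Using language models and several targeted features, we design a correction module for each type of common error. We substitute confusable words into the sentence written by the learner to obtain a set of candidate sentences, and then judge whether the original sentence or one of the candidates is better. Experimental results show that our method performs reasonably well at correcting grammatical errors made by English learners.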


    In this paper, we describe a system for correcting grammatical errors in texts written by non-native learners of English. In our approach, a given sentence is sent to a number of modules, each focusing on a specific error type. The modules apply different approaches tailored to different types of errors, mostly based on language model probabilities. A main program integrates the corrections from these modules and outputs the corrected sentence. We evaluated our system on the official test data of the CoNLL-2014 shared task and obtained an F-measure of 0.36.
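The substitute-and-compare idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy corpus, the add-one-smoothed bigram model (standing in for the much larger language models a real system would use), the confusion sets, the `margin` threshold, and the function names. It tries only one substitution at a time and is not the thesis's actual implementation.

```python
import math
from collections import Counter

# Toy training corpus (assumption); a real system would use a large corpus.
CORPUS = [
    "she has a book",
    "he has a dog",
    "they have a book",
    "they have a dog",
    "we have books",
]

# Hypothetical confusion sets: words that learners commonly mix up.
CONFUSION_SETS = [{"has", "have"}, {"book", "books"}, {"a", "an", "the"}]


def train_bigram_counts(sentences):
    """Collect context and bigram counts plus the vocabulary size."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"<s>", "</s>"}
    for s in sentences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)


def log_prob(sentence, model):
    """Add-one-smoothed bigram log-probability of a sentence."""
    contexts, bigrams, v = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + v))
        for a, b in zip(tokens, tokens[1:])
    )


def candidates(sentence):
    """Yield sentences that differ from the input by one confusable word."""
    tokens = sentence.split()
    for i, tok in enumerate(tokens):
        for conf in CONFUSION_SETS:
            if tok in conf:
                for alt in sorted(conf - {tok}):
                    yield " ".join(tokens[:i] + [alt] + tokens[i + 1:])


def correct(sentence, model, margin=0.5):
    """Return the best candidate only if it beats the original by `margin`."""
    best, best_score = sentence, log_prob(sentence, model) + margin
    for cand in candidates(sentence):
        score = log_prob(cand, model)
        if score > best_score:
            best, best_score = cand, score
    return best


MODEL = train_bigram_counts(CORPUS)
print(correct("she have a book", MODEL))
```

The margin keeps the original sentence unless a candidate is clearly more probable, which is one simple way to avoid over-correcting text that is already acceptable.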

    Abstract
    Acknowledgments
    Contents
    List of Figures
    List of Tables
    1 Introduction
    2 Related Work
    3 Method
      3.1 Problem Statement
      3.2 Generating confusion set
      3.3 Modules focusing on specific errors
    4 Experiment Setting
      4.1 Resources
        4.1.1 British National Corpus and English Gigaword Corpus
        4.1.2 SRILM and RNNLM Toolkit
        4.1.3 Wikipedia
        4.1.4 Academic word list
        4.1.5 Hunspell spelling checker
        4.1.6 Stanford Parser
      4.2 Dataset and Evaluation metrics
        4.2.1 Dataset
        4.2.2 Evaluation metrics
    5 Result and discussion
      5.1 Overall performance
      5.2 System compared
        5.2.1 Noun numbers errors
        5.2.2 Article errors
        5.2.3 Word form errors
        5.2.4 Subject-verb agreement
      5.3 Error analysis
    6 Conclusion and Future Work
    Appendices
      A. Example of incorrect corrections


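    Full-text release date (on-campus network): full text not authorized for public release
    Full-text release date (off-campus network): full text not authorized for public release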
