
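Graduate student: 張書豪 (Chang, Shu Hao)
Thesis title: Preposition Error Correction based on Automatically Extracted Grammar Patterns
Original title: 利用樣式文法自動改正介系詞錯誤
Advisor: 張俊盛 (Jason Chang)
Oral defense committee: 蘇以文, 徐嘉連, 陳浩然, 柯淑津
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2015
Academic year of graduation: 103 (ROC calendar)
Language: English
Number of pages: 70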
Keywords: Grammar Error Correction, Grammar Pattern, Parallel Phrase Extraction, Wikipedia Revision Corpus, British National Corpus, Machine Translation
  • In this thesis, we introduce a new method for providing corrective feedback on preposition errors in learners' writing. In our approach, we extract Synchronous Grammar Patterns (SGP) for grammatical error correction (GEC) from a large error-annotated corpus. We also extract grammar patterns (GP) from a general English corpus to validate and supplement the GEC patterns. The method involves automatically identifying syntactic patterns in the training data, automatically extracting distinct patterns together with their counts, and generating and filtering GEC patterns.

    At run time, we identify the grammar patterns in a given sentence, match them against the acquired GEC patterns, and rank the matching GEC patterns by frequency. We present WriteAhead, a prototype system that automatically extracts and displays relevant grammar error correction patterns, with example sentences, to prompt users as they type or move the mouse cursor over a draft. Preliminary experiments and evaluation on a publicly available dataset show that our method works reasonably well for preposition errors.
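    The run-time step described in the abstract (matching a sentence's grammar pattern against acquired correction patterns and ranking them by frequency) can be illustrated with a minimal sketch. This is not the thesis implementation: the pattern notation, the toy edit pairs, and the names `EDIT_PAIRS`, `build_pattern_table`, and `suggest` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical training data: (erroneous pattern, corrected pattern) pairs,
# standing in for patterns extracted from an error-annotated corpus.
EDIT_PAIRS = [
    ("discuss about N", "discuss N"),
    ("discuss about N", "discuss N"),
    ("discuss about N", "discuss with N"),
    ("depend of N", "depend on N"),
    ("depend of N", "depend on N"),
    ("depend of N", "depend upon N"),
    ("arrive to N", "arrive at N"),
]


def build_pattern_table(pairs, min_count=1):
    """Count each distinct correction pair and drop rare ones (assumed filter)."""
    counts = Counter(pairs)
    table = {}
    for (wrong, right), n in counts.items():
        if n >= min_count:
            table.setdefault(wrong, []).append((right, n))
    # Rank corrections for each erroneous pattern by descending frequency,
    # breaking ties alphabetically so the order is deterministic.
    for wrong in table:
        table[wrong].sort(key=lambda item: (-item[1], item[0]))
    return table


def suggest(table, pattern, top_k=3):
    """Return up to top_k ranked corrections for a pattern found in a sentence."""
    return table.get(pattern, [])[:top_k]
```

    With the toy data above, `suggest(build_pattern_table(EDIT_PAIRS), "depend of N")` ranks "depend on N" ahead of "depend upon N", because the former was observed more often. A pattern that never appears on the erroneous side yields no suggestion.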

    Abstract
    Acknowledgments
    Contents
    List of Figures
    List of Tables
    1 Introduction
    2 Related Work
    3 Method
      3.1 Problem Statement
      3.2 Parallel Phrase Extraction in Machine Translation
      3.3 Extracting Correction Patterns
        3.3.1 Data Pre-processing
        3.3.2 Generating and Filtering Consistent Block chunks
        3.3.3 Generating Synchronous Grammar Patterns
        3.3.4 Generating Grammar Patterns
      3.4 Retrieving and Ranking Suggestions
    4 Experiment Setting
      4.1 Resources
        4.1.1 Collins Cobuild Grammar Patterns
        4.1.2 Genia Tagger
      4.2 Datasets
        4.2.1 Wikipedia Preposition Correction Corpus
        4.2.2 British National Corpus
        4.2.3 CLC-FCE Dataset
      4.3 Evaluation metrics
    5 Evaluation and Discussion
      5.1 Automatic Evaluation
      5.2 Discussion
    6 Conclusion and Future Work
    Appendices
      A. Example of SGP corrective result
      B. Example of SGP incorrective result
      C. Example of GP corrective result
      D. Example of GP incorrective result
      E. Example of no patterns result
    References


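    Full-text release (on-campus network): the full text is not authorized for public release
    Full-text release (off-campus network): the full text is not authorized for public release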
