Author: 吳奇恩 (Wu, Chi-En)
Thesis title: Learning Synchronous Grammar Patterns for Assisted Writing and Translation (自動學習同步樣式文法應用於輔助寫作與翻譯)
Advisor: 張俊盛 (Chang, Jason)
Committee members: 蘇豐文 (Soo, Von-Wun); 陳浩然 (Chen, Hao-Jan)
Degree: Master
Department:
Year of publication: 2017
Academic year of graduation: 105 (ROC calendar)
Language: English
Number of pages: 53
Keywords (Chinese): 語法歸納, 電腦輔助寫作, 機器翻譯
Keywords (English): Grammar Induction, Computer Assisted Writing, Machine Translation
    We introduce a method for learning to extract Synchronous Grammar Patterns (SGPs) from a parallel corpus. The extracted patterns can be used to assist English learners in writing, as well as in machine translation. The method involves automatically recognizing grammar patterns in the source language (English), aligning the source grammar patterns to their target-language (Chinese) counterparts, and selecting representative SGPs and example sentences. We present a prototype interactive writing system, Translation Pattern Assistant (TPA), that applies our method to provide English writing suggestions together with Chinese reference translations. An evaluation on a set of randomly sampled SGPs shows that TPA provides satisfactory suggestions for learners. Additionally, we propose a method for using SGPs to improve machine translation quality.
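    The alignment step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the element encoding, and the example sentence pair are all hypothetical, and the word alignments are assumed to arrive in the common `i-j` (source index, target index) text format. Each element of an English grammar pattern is projected onto the Chinese token it aligns to, and the elements are then read off in Chinese word order.

```python
from collections import defaultdict


def parse_alignment(line):
    """Parse 'i-j i-j ...' alignment pairs into {source_index: [target_index, ...]}."""
    links = defaultdict(list)
    for pair in line.split():
        i, j = pair.split("-")
        links[int(i)].append(int(j))
    return links


def project_pattern(elements, tgt_tokens, links):
    """Project English pattern elements onto the target sentence.

    elements: list of (source_index, label). A label such as 'V' or 'n' marks
    a slot; a label of None marks a lexical word (e.g. a preposition) that is
    replaced by the target word it aligns to. (Hypothetical encoding.)
    """
    projected = []
    for src_index, label in elements:
        tgt_positions = links.get(src_index)
        if not tgt_positions:
            continue  # unaligned elements are simply dropped in this sketch
        first = min(tgt_positions)
        symbol = label if label is not None else tgt_tokens[first]
        projected.append((first, symbol))
    projected.sort()  # read the elements off in target word order
    return " ".join(symbol for _, symbol in projected)


# Hypothetical example: "provide him with food" / "提供 食物 給 他"
# English pattern for "provide": V n with n
elements = [(0, "V"), (1, "n"), (2, None), (3, "n")]
zh_tokens = ["提供", "食物", "給", "他"]
links = parse_alignment("0-0 1-3 2-2 3-1")

print(project_pattern(elements, zh_tokens, links))  # prints "V n 給 n"
```

    In this toy pair the two noun slots swap places on the Chinese side, which is the kind of reordering a synchronous pattern is meant to capture. A real system would also need to handle one-to-many links, alignment errors, and the filtering and reranking of candidate patterns.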

    Abstract
    Chinese Abstract
    Acknowledgements
    Contents
    List of Figures
    List of Tables
    1 Introduction
    2 Related Work
    3 Methodology
        3.1 Problem Statement
        3.2 Retrieving English Grammar Patterns
            3.2.1 Annotating Elements of Each Word from Sentences
            3.2.2 Retrieving Grammar Pattern Templates for Keywords
        3.3 Extracting Synchronous Grammar Patterns
            3.3.1 Tokenizing and Tagging Chinese Sentence
            3.3.2 Aligning Word Across Source and Target Sentences
            3.3.3 Aligning English Grammar Pattern to Chinese
        3.4 Selecting Representative SGP and Examples
            3.4.1 Filtering Out the Incorrect Chinese Pattern
            3.4.2 Reranking the Chinese Pattern
            3.4.3 Selecting Good SGP Examples
        3.5 Applying SGP to Machine Translation
            3.5.1 Parsing Source Sentence
            3.5.2 Transforming and Linearizing Parsing Tree
            3.5.3 Translating Reordered Sentence with Existing MT System
    4 Experiment and Evaluation
        4.1 Experimental Setting
        4.2 Data Sets and Tools
        4.3 Evaluation and Discussion
            4.3.1 Evaluation Metrics
            4.3.2 Evaluation Results
            4.3.3 Discussion
    5 Conclusion and Future Work
    Appendices
        A Grammar Pattern Templates
        B Weight Table
    Reference

