
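Graduate student: 顏孜羲 (Yen, Tzu Hsi)
Thesis title: Learning Synchronous Pattern Grammar for Machine Translation (自動學習運用於機器翻譯之樣式文法)
Advisor: 張俊盛 (Chang, Jason S.)
Oral defense committee: 柯淑津, 陳浩然, 徐嘉連
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science
Year of publication: 2015
Academic year of graduation: 103 (ROC calendar)
Language: English
Number of pages: 43
Keywords (Chinese): 機器翻譯, 樣式文法, 同步樣式文法
Keywords (English): Machine Translation, Pattern Grammar, Synchronous Pattern Grammar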
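  • In this thesis, we propose a method for automatically learning synchronous pattern grammar to help language learners translate sentences into another language. We first extract English pattern grammar from the English sentences of a parallel corpus, and then find the corresponding Chinese pattern grammar in the Chinese sentences. On the English side, we convert each English sentence into a sequence of phrases, convert the words in each phrase into pattern elements, and finally identify valid grammar patterns for English verbs, nouns, and adjectives. We also propose a Chinese word segmenter modeled on machine translation. After segmentation, we use machine-translation word alignment to extract the synchronous pattern grammar.
    Using the synchronous pattern grammar, we built an interactive online writing environment, WriteAhead for translators. It automatically displays the synchronous grammar patterns and examples related to the word at the cursor or mouse pointer position in the input area, as examples and prompts for translation.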


    In this paper, we introduce a method for learning Synchronous Pattern Grammar (SPG) to assist learners in translating sentences from one language into another. In our approach, we learn English pattern grammar from a given corpus, and then project the pattern grammar to Chinese through a parallel corpus with alignment annotation. The method involves converting English sentences into sequences of phrase chunks, converting phrase chunks to pattern elements, and extracting salient patterns for content words (verbs, nouns, and adjectives). The method also involves developing a machine-translation-based Chinese word segmenter, developing a base phrase chunker, and converting bilingual phrases to synchronous grammar patterns.
    With synchronous grammar patterns, we present an interactive writing environment, WriteAhead for translators, that automatically extracts and displays relevant synchronous grammar patterns with examples to prompt users as they translate or move the mouse over a translation draft during editing.
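    The English-side steps named in the abstract (phrase chunks, then pattern elements, then salient patterns) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the chunk-to-element table `CHUNK_TO_ELEMENT`, the function names, the input format of pre-chunked `(chunk_type, head_lemma)` pairs, and the use of a plain frequency threshold as a stand-in for whatever salience measure the thesis actually applies. Only verb patterns are covered.

```python
from collections import Counter

# Hypothetical chunk-type -> pattern-element table (an assumption for this
# sketch, not the inventory used in the thesis).
CHUNK_TO_ELEMENT = {"NP": "n", "ADJP": "adj", "ADVP": "adv"}


def to_elements(chunks):
    """Map (chunk_type, head_lemma) pairs to pattern elements."""
    elements = []
    for chunk_type, head in chunks:
        if chunk_type == "VP":
            elements.append("V")
        elif chunk_type == "PP":
            elements.append(head)  # keep the preposition itself
        else:
            elements.append(CHUNK_TO_ELEMENT.get(chunk_type, "_"))
    return elements


def candidate_patterns(chunks, max_len=4):
    """Yield (verb, pattern) pairs for every verb chunk in one sentence."""
    elements = to_elements(chunks)
    for i, (chunk_type, head) in enumerate(chunks):
        if chunk_type != "VP":
            continue
        last = min(i + max_len, len(chunks))
        for end in range(i + 2, last + 1):
            if chunks[end - 1][0] == "PP":
                continue  # skip patterns that end on a bare preposition
            yield head, " ".join(elements[i:end])


def salient_patterns(sentences, min_count=2):
    """Keep (verb, pattern) pairs seen at least min_count times.

    A raw count threshold is a simplification chosen for this sketch.
    """
    counts = Counter()
    for chunks in sentences:
        counts.update(candidate_patterns(chunks))
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Toy, hand-chunked input; a real pipeline would get this from a chunker.
sentences = [
    [("NP", "we"), ("VP", "discuss"), ("NP", "plan"), ("PP", "with"), ("NP", "him")],
    [("NP", "they"), ("VP", "discuss"), ("NP", "issue"), ("PP", "with"), ("NP", "us")],
    [("NP", "she"), ("VP", "discuss"), ("NP", "it")],
]

for (verb, pattern), n in sorted(salient_patterns(sentences).items()):
    print(verb, pattern, n)
```

    On this toy input, both `V n` and `V n with n` pass the default threshold for the verb "discuss". Projecting such patterns onto the Chinese side would additionally need word alignments from the parallel corpus, which this sketch does not attempt.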

    Abstract
    摘要 (Chinese Abstract)
    Acknowledgments
    Contents
    List of Figures
    List of Tables
    1 Introduction
    2 Related Work
    3 Method
      3.1 Problem Statement
      3.2 Map-Reduce programming model
      3.3 Extracting English Grammar Patterns
      3.4 Tokenizing Chinese Sentences
      3.5 Aligning English Pattern to Chinese
    4 Experimental Setting
      4.1 Experiments and setting
      4.2 Training Data and Tools
    5 Evaluation Results
    6 Conclusion and Future Work
    Appendices
      A English Grammar Pattern Templates


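    Full-text availability (on-campus network): not authorized for public release
    Full-text availability (off-campus network): not authorized for public release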
