
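Author: 蔣明撰
Thesis title: 混合單語與平行語料之搭配詞翻譯
Translating Collocation using Monolingual and Parallel Corpus
Advisor: 張俊盛
Oral examination committee: 粱停, 高照明
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2012
Academic year of graduation: 100
Language: English
Number of pages: 46
Keywords (Chinese): 搭配詞翻譯, 統計式機器翻譯, 電腦輔助翻譯
Keywords (English): collocation translation, statistical machine translation, computer-assisted translation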
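  • This thesis proposes a method for translating verb-noun collocations that combines a bilingual parallel corpus with a monolingual corpus. The method comprises two collocation translation models: a combination translation model and a bidirectional alignment translation model. The combination translation model translates each component word of a collocation separately and then uses a target language model to filter for appropriate translations. The bidirectional alignment translation model generates collocation translations directly from the word alignment information obtained by training on the parallel corpus. At run time, for an English collocation entered by the user, each model produces its own list of candidate translations; the two lists are then merged and re-ranked to form the system output. Experimental results show that the collocation translations produced by this method are significantly better than those of a conventional statistical machine translation system. The method can be used to assist second-language learners and to support the compilation of bilingual collocation dictionaries.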
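
The run-time flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the toy dictionaries, their scores, the function names, and the linear interpolation used to merge and re-rank the two candidate lists are all assumptions made for the example, since the record does not state the actual scoring formulas.

```python
from itertools import product

# Hypothetical toy data; every entry and score below is invented for illustration.
# Word-level translation probabilities (as might come from word alignment).
WORD_TRANSLATIONS = {
    "take": {"吃": 0.3, "拿": 0.5, "服用": 0.2},
    "medicine": {"藥": 0.7, "醫學": 0.3},
}
# Stand-in for a target language model: scores of verb-noun pairs
# observed in a monolingual target-language corpus.
TARGET_LM = {("吃", "藥"): 0.6, ("服用", "藥"): 0.3}
# Stand-in for whole-collocation translations read off word alignments.
ALIGNED = {("take", "medicine"): {("吃", "藥"): 0.7, ("服", "藥"): 0.3}}


def combination_candidates(verb, noun):
    """Translate each word separately, then keep only the pairs
    that the target-language model assigns a non-zero score."""
    scores = {}
    verb_options = WORD_TRANSLATIONS.get(verb, {}).items()
    noun_options = WORD_TRANSLATIONS.get(noun, {}).items()
    for (v, pv), (n, pn) in product(verb_options, noun_options):
        lm_score = TARGET_LM.get((v, n), 0.0)
        if lm_score > 0.0:
            scores[(v, n)] = pv * pn * lm_score
    return scores


def alignment_candidates(verb, noun):
    """Look up whole-collocation translations directly."""
    return dict(ALIGNED.get((verb, noun), {}))


def normalize(scores):
    """Rescale scores so they sum to 1 (empty dict if there are none)."""
    total = sum(scores.values())
    return {k: s / total for k, s in scores.items()} if total else {}


def translate(verb, noun, weight=0.5):
    """Merge both candidate lists and re-rank by an interpolated score.
    The interpolation weight is an assumed tuning parameter."""
    comb = normalize(combination_candidates(verb, noun))
    align = normalize(alignment_candidates(verb, noun))
    merged = {
        cand: weight * comb.get(cand, 0.0) + (1 - weight) * align.get(cand, 0.0)
        for cand in set(comb) | set(align)
    }
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for (v, n), score in translate("take", "medicine"):
        print(v + n, round(score, 3))
```

In this toy setup, a candidate proposed by both models accumulates score from each, so it tends to outrank a candidate proposed by only one of them; that is the intuition behind merging the two lists before re-ranking.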


    Abstract (in Chinese) i
    ABSTRACT ii
    Acknowledgements iii
    TABLE OF CONTENTS iv
    LIST OF FIGURES v
    LIST OF TABLES vi
    CHAPTER 1 INTRODUCTION 1
    CHAPTER 2 RELATED WORK 5
    CHAPTER 3 METHOD 8
        3.1 Problem Statement 8
        3.2 Extracting Collocation Translation from Parallel Corpus 9
            3.2.1 Generate word alignment from parallel corpus 10
            3.2.2 Build the combination translation model 13
            3.2.3 Build the bidirectional alignment translation model 15
        3.3 The Run Time Process 17
    CHAPTER 4 Experimental Setting and Results 21
        4.1 Experimental Settings 21
        4.2 Methods Compared 23
        4.3 Evaluation Data Sets and Metrics 24
        4.4 Tuning Parameters 28
        4.5 Evaluation Results 29
        4.6 Discussions 31
    CHAPTER 5 Conclusion and Future Work 33
    References 34
    Appendix – Test data 37


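    Full-text release date: full text not authorized for public release (on-campus network)
    Full-text release date: full text not authorized for public release (off-campus network)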