
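Graduate student: 簡嘉言 (Jia-Yan Jian)
Thesis title: Collocational Translation Memory Extraction Based on Statistical and Linguistic Information (以統計和語言特徵為本之搭配詞翻譯記憶體抽取)
Advisor: 張俊盛 (Jason S. Chang)
Oral defense committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2005
Academic year of graduation: 94
Language: English
Number of pages: 45
Chinese keywords: 雙語搭配詞 (bilingual collocation), 搭配詞擷取 (collocation extraction), 搭配詞索引典 (collocational concordancer)
English keywords: Bilingual Collocation Extraction, Collocational Translation Memory, Collocational Concordancer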
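  • This thesis proposes a method for extracting bilingual collocations from a parallel corpus. The method integrates statistical and linguistic information to extract bilingual collocations effectively. Combining three kinds of linguistic analysis (part-of-speech tagging, chunking, and clause identification) allows English collocation structures, such as verb-noun and adjective-noun collocations, to be extracted precisely. The method uses statistics from a large monolingual corpus to compensate for the sparse counts in a small bilingual corpus, and thereby obtains the distinct collocation types. Translations of the collocations are then extracted from the bilingual corpus using word alignment and bilingual dictionary information. Experimental evaluation confirms that the method extracts correct collocations and their corresponding translations, and that it can be applied to computer-assisted language learning and computer-assisted translation.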
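The idea of ranking candidate collocations with statistics from a large monolingual corpus can be illustrated with a minimal sketch. The abstract does not name the association measure used in the thesis, so the sketch assumes pointwise mutual information (PMI) as a stand-in. The function name `score_collocations`, the `min_count` threshold, and the toy verb-noun pairs are all hypothetical.

```python
import math
from collections import Counter


def score_collocations(pairs, min_count=2):
    """Rank (verb, noun) pairs by PMI (an assumed measure, not the thesis's own)."""
    pair_counts = Counter(pairs)
    verb_counts = Counter(verb for verb, _ in pairs)
    noun_counts = Counter(noun for _, noun in pairs)
    total = len(pairs)

    scores = {}
    for (verb, noun), count in pair_counts.items():
        if count < min_count:
            continue  # skip pairs that are too rare to score reliably
        # PMI = log2( P(verb, noun) / (P(verb) * P(noun)) )
        pmi = math.log2(count * total / (verb_counts[verb] * noun_counts[noun]))
        scores[(verb, noun)] = pmi
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy pairs standing in for verb-noun pairs taken from a tagged, chunked
# monolingual corpus (hypothetical data).
toy_pairs = [
    ("make", "decision"), ("make", "decision"), ("make", "decision"),
    ("take", "medicine"), ("take", "medicine"),
    ("make", "medicine"),
    ("take", "decision"),
    ("eat", "lunch"), ("eat", "lunch"),
]

ranked = score_collocations(toy_pairs)
for pair, pmi in ranked:
    print(pair, round(pmi, 3))
```

Pairs seen only once are filtered out before scoring, which mirrors the motivation stated above: counts from a small corpus are too sparse to trust, so the statistics come from a larger one.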


    In this paper, we propose a new method for extracting bilingual collocations from a parallel corpus. The method integrates statistical and linguistic information for effective extraction of bilingual collocations. It involves three steps: obtaining an extended list of distinct English collocations from a very large monolingual corpus, identifying the collocation instances in a parallel corpus, and extracting translation equivalents of the collocations based on word alignment information. At run time, collocations in the parallel corpus are identified and aligned to their translation equivalents. Experimental results show that our method is efficient for learning a translation memory of collocations. We applied the method to develop a collocational concordancer, TANGO, which showed great potential for applications in Computer Assisted Language Learning and Computer Assisted Translation.
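The last step, extracting translation equivalents from word alignment information, can be sketched as follows. This is an illustration under stated assumptions, not the thesis's implementation: it assumes each sentence pair already comes with word alignment links and a marked collocation instance, and it simply keeps the most frequent aligned target phrase. The function names, the tuple layout, and the toy sentences are hypothetical.

```python
from collections import Counter, defaultdict


def collect_translations(instances):
    """Count aligned target phrases for each collocation.

    Each instance is (en_tokens, zh_tokens, links, (i, j)), where links is a
    set of (english_index, chinese_index) alignment pairs and (i, j) are the
    positions of the two collocation words in en_tokens.
    """
    memory = defaultdict(Counter)
    for en_tokens, zh_tokens, links, (i, j) in instances:
        # Target positions aligned to either word of the collocation
        target_idx = sorted(t for s, t in links if s in (i, j))
        if not target_idx:
            continue  # no alignment evidence for this instance
        translation = " ".join(zh_tokens[t] for t in target_idx)
        memory[(en_tokens[i], en_tokens[j])][translation] += 1
    return memory


def best_translation(memory, collocation):
    """Return the most frequent aligned translation, or None if unseen."""
    counts = memory.get(collocation)
    return counts.most_common(1)[0][0] if counts else None


# Toy aligned sentence pairs (hypothetical data).
instances = [
    ("we must make a decision".split(), ["我們", "必須", "做", "決定"],
     {(0, 0), (1, 1), (2, 2), (4, 3)}, (2, 4)),
    ("they make a decision today".split(), ["他們", "今天", "做", "決定"],
     {(0, 0), (1, 2), (3, 3), (4, 1)}, (1, 3)),
    ("she will make the decision".split(), ["她", "將", "下", "決定"],
     {(0, 0), (1, 1), (2, 2), (4, 3)}, (2, 4)),
]

memory = collect_translations(instances)
print(best_translation(memory, ("make", "decision")))
```

Keeping the full counter for each collocation, rather than only the top candidate, leaves room to display alternative translations, which is the kind of lookup a concordancer would need.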

    摘要 (Chinese abstract)
    Abstract
    Table of Contents
    Chapter 1 Introduction
    Chapter 2 Related Work
    Chapter 3 Extraction of Collocational Translation Memory
        3.1 Problem Statement
        3.2 Taggers for Parts of Speech, Chunks, and Clauses
        3.3 Extraction of Collocation Types in M
        3.4 Extraction of Collocation Instances in PC
        3.5 Extracting Collocation Translation Equivalents in Bilingual Corpus
    Chapter 4 Implementation and Evaluation
        4.1 Performance of the Chunker and Clauser
        4.2 Evaluation of the Collocation Extraction
        4.3 Evaluation of the Collocation Translation
    Chapter 5 Discussion
    Chapter 6 Conclusion
    References
    Appendix
        Top 100 vn collocation extracted from BNC
        Top 100 vnp collocation extracted from BNC
        Top 100 vpn collocation extracted from BNC


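    Full-text release date: full text not authorized for public release (on-campus network)
    Full-text release date: full text not authorized for public release (off-campus network)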
