Graduate student: | 彭成全 Peng, Cheng-Quan |
---|---|
Thesis title: | 語法剖析器生成中文的文法規則 Extracting Chinese Lexical Grammar Patterns Using Dependency Parsing |
Advisor: | 張俊盛 Chang, Jason S. |
Oral defense committee: | 陳浩然 Chen, Hao-Jan; 馬偉雲 Ma, Wei-Yun |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications |
Year of publication: | 2019 |
Academic year of graduation: | 107 |
Language: | Chinese |
Number of pages: | 37 |
Keywords (Chinese): | 句法關係、相依關係、文法規則、輔助寫作系統、搭配詞 |
Keywords (English): | Grammatical Relations, Dependency, Grammatical Patterns, Computer Assisted Writing System, Collocations |
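This study presents SmartWrite, a prototype interactive writing system that automatically extracts grammar patterns, collocations, and example sentences for Chinese verbs from large Chinese corpora, helping users learn and providing more resources for Chinese language teaching. The method applies a dependency parser to the corpus and combines the dependency relations of each verb into grammar patterns, then selects suitable collocations and example sentences for that verb. At run time, the system takes the last word the user has typed and outputs its grammar patterns, collocations, and example sentences for immediate lookup. The evaluation is a manual assessment of a set of verbs, and the results for these verbs are reported.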
This thesis presents an interactive writing system, SmartWrite, which shows word usage through grammar patterns, collocations, and example sentences to assist learners in writing. We propose a method for inducing common grammar patterns from large-scale Chinese corpora. We use a dependency parser to extract grammatical relations. We then count the words that stand in a dependency relation with each headword to generate collocations. Finally, we sample sentences that exemplify the patterns of each headword. At run time, the last word the user has typed is identified, and its grammar patterns, collocations, and example sentences are displayed as writing hints. A manual evaluation on a set of verbs shows that the method provides reasonable results.
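The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the input format (sentences already parsed into `(word, pos, head, deprel)` tuples), the relation labels `nsubj` and `dobj`, the convention that verb tags start with `V`, and the function names `build_index` and `hints` are all assumptions made for illustration. Collocates are ranked here by raw frequency, which may differ from the scoring used in the thesis.

```python
from collections import Counter, defaultdict

# Hypothetical pre-parsed input: one list per sentence, one tuple per token.
# head is the 0-based index of the governing token (-1 marks the root).
PARSED = [
    [("他", "PN", 1, "nsubj"), ("吃", "VV", -1, "root"), ("蘋果", "NN", 1, "dobj")],
    [("我", "PN", 1, "nsubj"), ("吃", "VV", -1, "root"), ("飯", "NN", 1, "dobj")],
    [("她", "PN", 1, "nsubj"), ("吃", "VV", -1, "root")],
]


def build_index(parsed_sentences):
    """Combine each verb's dependency relations into patterns and count collocates."""
    patterns = defaultdict(Counter)    # verb -> Counter of pattern strings
    collocates = defaultdict(Counter)  # (verb, deprel) -> Counter of dependent words
    examples = defaultdict(list)       # (verb, pattern) -> example sentences
    for sent in parsed_sentences:
        text = "".join(token[0] for token in sent)
        for i, (word, pos, _head, _rel) in enumerate(sent):
            if not pos.startswith("V"):  # assumed convention: verb tags start with "V"
                continue
            # Direct dependents of this verb, in sentence order.
            deps = [(j, w, r) for j, (w, _p, h, r) in enumerate(sent) if h == i]
            before = [r for j, _w, r in deps if j < i]
            after = [r for j, _w, r in deps if j > i]
            pattern = " ".join(before + ["V"] + after)
            patterns[word][pattern] += 1
            for _j, dep_word, rel in deps:
                collocates[(word, rel)][dep_word] += 1
            examples[(word, pattern)].append(text)
    return patterns, collocates, examples


def hints(tokens, index, top_n=3):
    """Look up writing hints for the last token the user has typed."""
    patterns, collocates, examples = index
    if not tokens or tokens[-1] not in patterns:
        return []
    verb = tokens[-1]
    result = []
    for pattern, count in patterns[verb].most_common(top_n):
        relations = [r for r in pattern.split() if r != "V"]
        result.append({
            "pattern": pattern,
            "count": count,
            "collocates": {
                r: [w for w, _ in collocates[(verb, r)].most_common(top_n)]
                for r in relations
            },
            "example": examples[(verb, pattern)][0],
        })
    return result


index = build_index(PARSED)
for hint in hints(["我", "想", "吃"], index):
    print(hint)
```

In this sketch, the index is built once offline and the run-time step is a dictionary lookup on the last token, which mirrors the split between pattern induction and run-time hinting that the abstract describes.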