| Graduate Student | Wu, Dong-Ze (吳東澤) |
|---|---|
| Thesis Title | Classical Chinese Poetry Generation from Vernacular Chinese: A Word-Enhanced Supervised Approach (外部分詞知識強化下基於白話文的古詩生成) |
| Advisor | Chen, Arbee L.P. (陳良弼) |
| Committee Members | Shen, Chih-Ya (沈之涯); Chien, Jen-Tzung (簡仁宗); Fan, Yao-Chung (范耀中) |
| Degree | Master |
| Department | College of Electrical Engineering and Computer Science - Department of Computer Science |
| Year of Publication | 2021 |
| Academic Year of Graduation | 109 (ROC calendar) |
| Language | English |
| Number of Pages | 30 |
| Keywords (Chinese) | 白話文, 古詩, 深度學習, 文本生成, 中文分詞 |
| Keywords (English) | Vernacular Chinese, Classical Chinese Poetry, Deep Learning, Text Generation, Chinese Word Segmentation |
In recent years, with the rapid development of deep learning, natural language processing has made great progress in many areas. In the field of text generation, classical Chinese poetry, as an important part of Chinese culture, has also attracted growing attention. However, existing research on neural-network-based classical Chinese poetry generation ignores the semantics contained in Chinese words. A Chinese sentence is a sequence of characters without spaces, so segmenting it properly is essential for understanding the original text correctly. Therefore, a model that knows how to segment a sentence can better understand its meaning.

In this paper, we propose WE-Transformer (Word-Enhanced Transformer), a novel model that generates classical Chinese poetry from vernacular Chinese and incorporates external Chinese word segmentation knowledge. Our model learns word semantics from character embeddings with a bidirectional LSTM and improves the quality of the generated classical poems by augmenting the Transformer with extra word encoders.

Automatic and human evaluations demonstrate that our method outperforms the baselines and state-of-the-art models.
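To make the described architecture more concrete, the following is a minimal PyTorch sketch of the word-enhancement idea: the character embeddings of each externally segmented word are summarized by a bidirectional LSTM into a word vector, and an extra word-level Transformer encoder processes these vectors alongside the character-level encoder. All class names, dimensions, and the final fusion with the poem decoder are illustrative assumptions, not the thesis's actual WE-Transformer implementation.

```python
# Minimal sketch of the word-enhanced encoding idea described in the abstract,
# written with standard PyTorch modules. Names, sizes, and the fusion step are
# illustrative assumptions, not the thesis's actual WE-Transformer.
import torch
import torch.nn as nn

class WordEnhancedEncoder(nn.Module):
    def __init__(self, vocab_size, d_model=256, nhead=4, num_layers=2):
        super().__init__()
        self.char_embed = nn.Embedding(vocab_size, d_model, padding_idx=0)
        # BiLSTM that reads the character embeddings of one segmented word
        # and summarizes them into a single word-level vector.
        self.word_bilstm = nn.LSTM(d_model, d_model // 2,
                                   batch_first=True, bidirectional=True)
        self.char_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), num_layers)
        self.word_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), num_layers)

    def forward(self, char_ids, word_spans):
        # char_ids: (seq_len,) character ids of one vernacular sentence
        # word_spans: [(start, end), ...] produced by an external word segmenter
        chars = self.char_embed(char_ids).unsqueeze(0)        # (1, seq_len, d_model)
        char_memory = self.char_encoder(chars)                # character-level context

        word_vecs = []
        for start, end in word_spans:
            _, (h, _) = self.word_bilstm(chars[:, start:end])  # BiLSTM over the word's chars
            word_vecs.append(torch.cat([h[0], h[1]], dim=-1))  # concat fwd/bwd final states
        word_memory = self.word_encoder(torch.stack(word_vecs, dim=1))  # word-level context

        # A Transformer decoder generating the classical poem would attend to
        # both memories; the fusion details are omitted in this sketch.
        return char_memory, word_memory

# Usage: encode a 5-character vernacular line segmented into three words (toy ids).
enc = WordEnhancedEncoder(vocab_size=100)
char_ids = torch.tensor([11, 12, 13, 14, 15])
print([m.shape for m in enc(char_ids, [(0, 2), (2, 4), (4, 5)])])
```

In the full model, the decoder that generates the classical poem would attend to both the character-level and word-level memories; this sketch stops at the encoding step.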