
Author: Chen, Chieh-Yang (陳杰暘)
Thesis title: AirConcierge: Generating Task-Oriented Dialogue via Efficient Large-Scale Knowledge Retrieval (航空接待員 : 通過高效的大型知識檢索生成任務導向的對話)
Advisor: Chang, Shih-Chieh (張世杰)
Oral defense committee: Chen, Yun-Nung (陳縕儂); Wu, I-Chen (吳毅成)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science
Year of publication: 2020
Academic year of graduation: 108 (ROC calendar)
Language: English
Number of pages: 40
Keywords (Chinese): 任務導向對話系統, 航空對話資料集, 知識庫提取
Keywords (English): Task-oriented dialogue system, AirDialogue Dataset, Knowledge retrieval
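  • In recent years, neural network approaches have achieved remarkable success in task-oriented dialogue systems, and generating task-oriented dialogue relies heavily on accessing an external knowledge base to retrieve task-relevant information. Developing a real-world task-oriented dialogue system, however, usually involves accessing large external knowledge bases, which cannot simply be encoded by neural approaches such as memory network mechanisms. To alleviate this problem, we propose an end-to-end trainable text-to-SQL guided framework that trains a neural task-oriented dialogue system to interact with the knowledge base through generated SQL queries in order to obtain data. Specifically, the neural dialogue system first learns to ask about and confirm the customer's intent, then dynamically decides when to convert the customer's constraints into an executable SQL query that fetches the relevant information from the knowledge base. With our method, the system does not need to incorporate the entire knowledge base; instead, it can use a small number of more accurate query results to generate useful dialogue responses efficiently. We evaluate the proposed method on the AirDialogue dataset, a large task-oriented corpus released by Google that contains conversations in which customers book flight tickets with a system agent. Experiments show that the proposed method significantly improves over previous models in task accuracy and BLEU score, which demonstrates both the quality of the generated dialogue and the ability to complete the given task.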


    Despite recent success in neural task-oriented dialogue systems, developing such a system for real-world use involves accessing large-scale knowledge bases (KBs), which cannot simply be encoded by neural approaches such as memory network mechanisms. To alleviate this problem, we propose AirConcierge, an end-to-end trainable text-to-SQL guided framework for learning a neural agent that interacts with KBs through generated SQL queries. Specifically, the neural agent first learns to ask about and confirm the customer's intent during the multi-turn interaction, then dynamically determines when to ground the user constraints into executable SQL queries that fetch the relevant information from the KBs. With our method, the agent can use fewer but more accurate fetched results to generate useful responses efficiently, instead of incorporating the entire KB. We evaluate the proposed method on the AirDialogue dataset, a large corpus released by Google that contains conversations in which customers book flight tickets with an agent. The experimental results show that AirConcierge significantly improves over previous work in terms of accuracy and BLEU score, which demonstrates both the ability to complete the given task and the quality of the generated dialogues.
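The idea of grounding user constraints into an executable SQL query can be sketched as follows. This is only an illustrative sketch and not the thesis implementation: in AirConcierge both the decision of when to query and the SQL generation are learned by neural modules, whereas here they are replaced by hand-written rules. The table layout, column names, `REQUIRED_SLOTS`, and all function names are assumptions made up for this example.

```python
import sqlite3

# Hypothetical flight records; the schema is an assumption for illustration.
FLIGHTS = [
    (1001, "JFK", "LAX", 350, "economy"),
    (1002, "JFK", "LAX", 180, "economy"),
    (1003, "JFK", "SFO", 150, "economy"),
    (1004, "JFK", "LAX", 900, "business"),
]
ALLOWED_COLUMNS = {"flight_number", "departure", "destination", "price", "cabin"}
ALLOWED_OPS = {"=", "<=", ">="}
# Stand-in for the learned gate: query only once these slots are known.
REQUIRED_SLOTS = {"departure", "destination"}


def make_kb():
    """Create an in-memory knowledge base holding the toy flight table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flights (flight_number INTEGER, departure TEXT, "
        "destination TEXT, price INTEGER, cabin TEXT)"
    )
    conn.executemany("INSERT INTO flights VALUES (?, ?, ?, ?, ?)", FLIGHTS)
    return conn


def ready_to_query(constraints):
    """Rule-based stand-in for deciding when to issue the SQL query."""
    return REQUIRED_SLOTS.issubset(constraints)


def build_query(constraints):
    """Turn {column: (operator, value)} into a parameterized SQL query."""
    clauses, params = [], []
    for column, (op, value) in constraints.items():
        if column not in ALLOWED_COLUMNS or op not in ALLOWED_OPS:
            raise ValueError(f"unsupported constraint: {column} {op}")
        clauses.append(f"{column} {op} ?")
        params.append(value)
    where = " AND ".join(clauses) if clauses else "1"
    return f"SELECT * FROM flights WHERE {where} ORDER BY price", params


def fetch_matches(conn, constraints):
    """Fetch only the rows that satisfy the constraints, cheapest first."""
    sql, params = build_query(constraints)
    return conn.execute(sql, params).fetchall()


conn = make_kb()
constraints = {"departure": ("=", "JFK"), "destination": ("=", "LAX")}
constraints["price"] = ("<=", 400)  # constraint gathered in a later turn
if ready_to_query(constraints):
    rows = fetch_matches(conn, constraints)
    print(rows)
```

The response generator then only has to condition on the few returned rows rather than on every record in the knowledge base, which is the efficiency argument made in the abstract.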

    1 Introduction
    2 Related Work
      2.1 Task-oriented Dialogue System
      2.2 Semantic Parsing in SQL
    3 The Proposed Framework
      3.1 System Architecture of AirConcierge
      3.2 Dialogue Encoder
      3.3 Dialogue State Tracker (Information Gate Module)
      3.4 SQL Generator
      3.5 Knowledge Base Memory Encoder
      3.6 Dialogue Decoder
      3.7 Dialogue Goal Generator
      3.8 Objective Function
    4 Experiments
      4.1 Dataset
      4.2 Training Details
      4.3 Evaluation
      4.4 Experimental Results: Accuracy
      4.5 Experimental Results: Scalability
      4.6 Supplementary
    5 Conclusions
    References

