| Field | Value |
|---|---|
| Graduate Student | 陳冠衣 Chen, Guan-Yi |
| Thesis Title | 在虛擬戲劇中根據角色的恐懼情緒學習審問的策略 (Learning Interrogation Strategies Based on Fear Emotion in Virtual Drama Dialogue) |
| Advisor | 蘇豐文 |
| Oral Defense Committee | 朱宏國, 劉瑞瓏 |
| Degree | Master |
| Department | College of Electrical Engineering and Computer Science, Department of Computer Science |
| Year of Publication | 2012 |
| Graduation Academic Year | 100 (ROC calendar, 2011-2012) |
| Language | English |
| Pages | 55 |
| Keywords | Virtual drama dialogue; intelligent agent; interrogation strategies; reinforcement learning; fear emotion simulation |
Abstract:

Dialogue systems have been developed for many years, and as they become ubiquitous, the dialogue strategies of virtual agents are receiving more and more attention. However, selecting a proper dialogue strategy in a specific social context is not a trivial task, since the world is complex. In this thesis, we propose using reinforcement learning to learn strategies for interrogation dialogue in virtual drama. Our first contribution is a reinforcement learning framework that learns dialogue strategies from interrogation dialogues; the second is incorporating the social context and emotional states of agents into dialogue strategy selection. To demonstrate and evaluate the approach, we build the background knowledge of the world from a scenario in a detective novel. In particular, we model the emotion variations of a suspect with a human-emotion generation function grounded in the psychological literature, so that the detective can learn dialogue strategies conditioned on the suspect's emotional context. The learned dialogue policy is highly sensitive in detecting when the suspect is lying, and as a result the detective obtains more correct information.
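This record does not include the thesis' implementation. As a rough, self-contained sketch of the kind of setup the abstract describes, the Python snippet below runs tabular Q-learning over a toy interrogation: the dialogue state is reduced to the suspect's fear level, and the action set, fear dynamics, and reward function are all invented for illustration rather than taken from the thesis.

```python
import random
from collections import defaultdict

# Hypothetical interrogation moves and discretized fear levels; the thesis'
# actual action set and emotion model are not specified in this record.
ACTIONS = ["ask_gently", "present_evidence", "threaten"]
FEAR = ["low", "medium", "high"]

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2   # learning rate, discount, exploration

Q = defaultdict(float)                   # Q[(fear_level, action)] -> value

def choose(fear):
    """Epsilon-greedy selection over interrogation moves."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(fear, a)])

def suspect_step(fear, action):
    """Toy suspect dynamics (an assumption, not the thesis model):
    aggressive moves raise fear, gentle ones lower it, and a more
    fearful suspect is more likely to reveal truthful information."""
    i = FEAR.index(fear)
    if action in ("threaten", "present_evidence"):
        i = min(i + 1, len(FEAR) - 1)
    else:
        i = max(i - 1, 0)
    told_truth = random.random() < (0.2 + 0.3 * i)  # fear raises truth odds
    reward = 1.0 if told_truth else -0.1
    return FEAR[i], reward

for _ in range(5000):                    # simulated interrogation episodes
    fear = "low"
    for _ in range(10):                  # dialogue turns per episode
        action = choose(fear)
        next_fear, reward = suspect_step(fear, action)
        # Q-learning update: Q += alpha * (r + gamma * max_a' Q(s', a') - Q)
        best_next = max(Q[(next_fear, a)] for a in ACTIONS)
        Q[(fear, action)] += ALPHA * (reward + GAMMA * best_next - Q[(fear, action)])
        fear = next_fear

# Inspect the greedy policy learned for each fear level.
for f in FEAR:
    print(f, max(ACTIONS, key=lambda a: Q[(f, a)]))
```

Under these toy dynamics the learned policy tends to favor pressure moves once fear is raised, mirroring in caricature the abstract's claim that emotion-conditioned strategies elicit more correct answers; the real thesis state would include richer social-context features than a single fear level.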