
Student: 張仲彣 (Tschang, Jong-Wen)
Title: 基於文字之室內導航系統—以對話系統實現
An Indoor Navigation Dialogue System
Advisor: 許聞廉 (Hsu, Wen-Lian)
Committee members: 王昭能 (Wang, Chao-Neng), 盧錦隆 (Lu, Chin-Lung)
Degree: Master
Department: Institute of Information Systems and Applications, College of Electrical Engineering and Computer Science
Year of publication: 2022
Graduation academic year: 110
Language: Chinese
Pages: 51
Chinese keywords: 室內導航, 對話系統, 語言理解, 語言生成, 聊天機器人
English keywords: indoor navigation, chatbot, dialogue system, language understanding, language generation
    Nowadays most people are accustomed to having a navigation system guide their way. With a smartphone app, users can immediately locate themselves and compute a route to their destination. For outdoor navigation, Google Maps and Apple Maps dominate. However, the GPS signal that outdoor navigation relies on suffers from positioning error and can only fix a two-dimensional position, which makes it of little use indoors. To achieve indoor navigation without the cost of additional sensors or wireless-signal infrastructure, relying only on the exchange of asking for and giving directions, a system must understand the locations, nearby landmarks, and intended destination described in text, just as a human guide would. That understanding in turn relies on natural language processing techniques. A system with sufficient Chinese comprehension and the ability to give precise directions can deliver a good navigation experience without depending on hardware, eliminating the cost of installing and maintaining such equipment.
    This thesis presents a mobile application, a "text-based indoor navigation dialogue system". First, a statistical principle-based approach combined with a reduction method extracts the core concept of each sentence and generates semantic templates for the different intents that occur in dialogue. A comprehension core built from these templates, together with an intent modifier, then identifies the intent and key terms of each user utterance and extracts information such as the described starting point and destination. Finally, the system generates navigation guidance sentences to provide precise directions.
    This study collected 1,200 multi-turn navigation dialogues. Experiments show that the comprehension core with the intent modifier reaches an accuracy of 98.1%, and response-sentence generation reaches 95.3%. In addition, 23 participants operated the mobile application in a real-world setting and provided feedback; the system scored 72.25 on the System Usability Scale (SUS), above the usability benchmark of 68. These results demonstrate the system's effectiveness for text-based dialogue navigation.
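The two-stage understanding pipeline described above (semantic templates matched by a comprehension core, then corrected by an intent modifier) can be illustrated with a minimal sketch. The templates, intent labels, and keyword-based modifier below are invented for illustration only; they are not the thesis's actual templates or implementation:

```python
# Sketch of template-based intent extraction with a corrective second
# stage. All templates and intent names here are illustrative.

# Each semantic template pairs an ordered word pattern with an intent.
TEMPLATES = [
    (("where", "is"), "ask_location"),      # e.g. "where is the exit"
    (("how", "get", "to"), "ask_route"),    # e.g. "how do I get to B1"
    (("i", "am", "at"), "report_position"), # e.g. "I am at the lobby"
]

def match_template(tokens):
    """Return the first intent whose pattern words appear in order."""
    for pattern, intent in TEMPLATES:
        it = iter(tokens)  # shared iterator enforces word order
        if all(word in it for word in pattern):
            return intent
    return None

def intent_modifier(tokens, intent):
    """Second stage: override the template match using keyword cues."""
    if "buy" in tokens or "price" in tokens:
        return "product_search"  # shopping cues trump route templates
    return intent or "fallback"

def understand(sentence):
    tokens = sentence.lower().split()
    return intent_modifier(tokens, match_template(tokens))

print(understand("How do I get to the elevator"))  # ask_route
print(understand("Where can I buy coffee"))        # product_search
```

In this sketch the modifier plays the same corrective role as the thesis's intent modifier: it catches utterances that superficially match a navigation template (here, one starting with "where") but actually belong to the product-search scenario.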

    Table of Contents
    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    1 Introduction
      1.1 Research Background
        1.1.1 Artificial Intelligence
        1.1.2 Dialogue Systems
        1.1.3 Challenges of Indoor Navigation
      1.2 Research Topic
      1.3 Research Contributions
    2 Literature Review
      2.1 Natural Language Understanding
        2.1.1 N-gram
        2.1.2 Word Embeddings
      2.2 Natural Language Text Generation
        2.2.1 Syntax-Structured Text Generation
        2.2.2 Probabilistic and Deep-Learning Text Generation
        2.2.3 Template-Based Text Generation
      2.3 A* Shortest-Path Search Algorithm
      2.4 Chatbots
        2.4.1 History of Dialogue Systems
        2.4.2 Rule-Based Dialogue Systems
        2.4.3 Neural-Network Dialogue Systems
      2.5 Tools
        2.5.1 Ontology
        2.5.2 InfoMap
        2.5.3 Statistical Principle-Based Machine Learning
      2.6 Indoor Navigation
        2.6.1 Wireless-Signal-Based
        2.6.2 Sensor-Based
        2.6.3 Vision-Based
        2.6.4 Text-Based
    3 Methodology and Design
      3.1 System Flow
      3.2 Semantic Templates
        3.2.1 Dataset Classification
        3.2.2 Word-Sense Annotation
        3.2.3 Template Generation
      3.3 Intent Extraction
        3.3.1 Comprehension Core
        3.3.2 Template-Based Understanding
        3.3.3 Intent Modifier
      3.4 Slot Design and Slot Filling
      3.5 Response Text Generation
        3.5.1 Template Selection
        3.5.2 Template-Based Generation
      3.6 Indoor Navigation Scenario
        3.6.1 Navigation Nodes and Navigation Objects
        3.6.2 Path Navigation Algorithm
        3.6.3 Candidate Object Retrieval
        3.6.4 Navigation Data Generation
      3.7 Product Search Scenario
        3.7.1 Products and Navigation Objects
        3.7.2 Product Retrieval
    4 Experimental Results and Discussion
      4.1 Training Datasets
        4.1.1 Semantic Template Dataset
        4.1.2 Intent Modifier Training Dataset
      4.2 Test Dataset
      4.3 Evaluation Methods
        4.3.1 Evaluation of Intent Extraction
        4.3.2 Evaluation of Response Intent Selection
        4.3.3 User Testing
      4.4 Results
        4.4.1 Intent Extraction Results
        4.4.2 Response Intent Selection Results
        4.4.3 User Test Analysis
    5 Conclusion and Future Work
      5.1 Conclusion
      5.2 Future Work
    References
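The outline lists the A* shortest-path search algorithm (§2.3) as the route-planning component behind path navigation (§3.6.2). A generic sketch of A* on a toy grid follows; the grid, wall positions, and Manhattan heuristic are illustrative placeholders, not the thesis's actual map data:

```python
# Generic A* search: expands nodes in order of f(n) = g(n) + h(n),
# where g is the cost so far and h an admissible heuristic.
import heapq

def a_star(start, goal, neighbors, h):
    """Return a lowest-cost path from start to goal, or None."""
    open_heap = [(h(start), 0, start, [start])]  # (f, g, node, path)
    best_g = {start: 0}
    while open_heap:
        _, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return path
        for nxt, cost in neighbors(node):
            ng = g + cost
            if ng < best_g.get(nxt, float("inf")):  # found a cheaper route
                best_g[nxt] = ng
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt, path + [nxt]))
    return None

# Toy 4x4 grid with two blocked cells; Manhattan distance is admissible
# for 4-connected unit-cost moves.
WALLS = {(1, 1), (1, 2)}

def grid_neighbors(p):
    x, y = p
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx < 4 and 0 <= ny < 4 and (nx, ny) not in WALLS:
            yield (nx, ny), 1

goal = (3, 3)
manhattan = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
path = a_star((0, 0), goal, grid_neighbors, manhattan)
print(len(path) - 1)  # shortest path length: 6
```

In an indoor-navigation setting the grid cells would be replaced by navigation nodes and the heuristic by straight-line distance between node coordinates; the algorithm itself is unchanged.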

