Graduate student: | 林俊廷 Lin, Chun-Ting |
---|---|
Thesis title: | 基於深度強化學習方法下,針對接收訊號強度指標進行室內定位 Indoor Positioning via Received Signal Strength Indicators Using Deep Reinforcement Learning |
Advisor: | 鐘太郎 Jong, Tai-Lang |
Oral defense committee: | 廖梨君 Liao, Li-Chun; 黃裕煒 Huang, Yu-Wei; 謝奇文 Hsieh, Chi-Wen; 鐘太郎 Jong, Tai-Lang |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science, Department of Electrical Engineering |
Year of publication: | 2019 |
Academic year of graduation: | 107 (ROC calendar) |
Language: | Chinese |
Number of pages: | 85 |
Chinese keywords: | 深度強化學習、機器學習、人工智慧、室內定位、接收訊號強度指標、變分自動編碼器、物聯網 |
English keywords: | Deep reinforcement learning, Machine learning, Artificial intelligence, Indoor positioning, Received signal strength indicators, Variational autoencoder, Internet of things |
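This thesis performs indoor positioning using wireless received signal strength indicator data. Received signal strength indicators are commonly used in indoor positioning methods based on Bluetooth Beacon devices, and positioning moving or stationary objects equipped with Beacon devices has many applications. The thesis uses deep reinforcement learning to predict where an object is located indoors, then compares and discusses the results against those of other well-known machine learning methods.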
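The thesis first tests deep reinforcement learning without past experience, using a single data sample at a time, under several environment settings. It starts with the most simplified environment, then adds the noise introduced by people moving indoors, then the noise introduced by indoor obstacles, and finally combines the deep reinforcement learning model with a variational autoencoder model. On 200 randomly selected unlabeled samples, the positioning error under this last combined setting is 7.92 meters. The thesis then focuses on training the deep reinforcement learning model with both labeled and unlabeled data while taking past experience into account. Over 10 repetitions, each measuring 200 randomly selected unlabeled samples, the overall mean of the predicted average distance errors is only 5.31 meters. Over 10 repetitions on 200 randomly selected labeled samples, the overall mean is only 5.18 meters.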
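Besides positioning with deep reinforcement learning combined with a variational autoencoder, the thesis compares against the experimental results of other well-known machine learning methods: the variational autoencoder and K-means clustering from unsupervised learning, and the convolutional neural network from supervised learning. These methods use the same data as the combined deep reinforcement learning and variational autoencoder method, again with 10 repetitions of 200 randomly selected labeled and unlabeled samples. For unlabeled data, the overall means of the average distance errors are 11.61, 7.08, and 6.27 meters, respectively. For labeled data, they are 12.36, 7.2, and 5.99 meters, respectively. The combined deep reinforcement learning and variational autoencoder method therefore gives better results in every case.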
This thesis studies indoor positioning using wireless received signal strength indicator (RSSI) data. RSSIs are often used in indoor positioning methods based on Bluetooth Beacon devices, and positioning moving or fixed objects that carry Beacon devices has many applications. The thesis uses deep reinforcement learning (DRL) to predict where an object is located indoors, and compares and discusses the results against those of other well-known machine learning methods.
The thesis first tests the DRL method without past experience, using a single data sample at a time, under several environment settings: the most simplified environment, then noise interference caused by people moving indoors, then noise interference caused by indoor obstacles, and finally the DRL model combined with a variational autoencoder (VAE) model. For 200 randomly selected unlabeled samples, the distance error under the last, combined setting is 7.92 m. The thesis then focuses on using labeled and unlabeled data to train the DRL+VAE model with past experience taken into account. Averaged over 10 repetitions of 200 randomly selected samples, the mean distance error is 5.31 m for unlabeled data and 5.18 m for labeled data.
In addition to the DRL+VAE method, the thesis reports results for other well-known machine learning methods for comparison: VAE and the K-means algorithm, which are unsupervised, and the convolutional neural network (CNN), which is supervised. These methods use the same data as the DRL+VAE method, again with 10 repetitions of 200 randomly selected labeled and unlabeled samples. The overall mean distance errors for VAE, K-means, and CNN are 11.61, 7.08, and 6.27 m, respectively, on unlabeled data, and 12.36, 7.2, and 5.99 m, respectively, on labeled data. The DRL+VAE model therefore gives better results in both cases.
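The evaluation protocol described in the abstract (10 repetitions, each drawing 200 random samples and averaging the distance error, then averaging the 10 results) can be sketched roughly as follows. This is a minimal illustration, not the thesis code: the function names, the `predict` callable, the fixed seed, and the use of 2-D Euclidean distance are all assumptions made here for the example.

```python
import math
import random
from statistics import mean


def mean_distance_error(predicted, actual):
    """Average Euclidean distance between paired predicted and true positions."""
    return mean(math.dist(p, a) for p, a in zip(predicted, actual))


def repeated_evaluation(predict, samples, positions,
                        n_repeats=10, n_samples=200, seed=0):
    """Repeat random subsampling and return (per-run errors, overall mean).

    predict   -- hypothetical callable mapping one RSSI vector to an (x, y) estimate
    samples   -- list of RSSI vectors
    positions -- reference (x, y) positions aligned with `samples`
    """
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    run_errors = []
    for _ in range(n_repeats):
        # Draw n_samples distinct indices without replacement for this run.
        idx = rng.sample(range(len(samples)), n_samples)
        preds = [predict(samples[i]) for i in idx]
        truth = [positions[i] for i in idx]
        run_errors.append(mean_distance_error(preds, truth))
    return run_errors, mean(run_errors)


if __name__ == "__main__":
    # Toy data: the "RSSI vector" is just the true position, and the
    # stand-in predictor is always off by a (3, 4) offset.
    toy_positions = [(i % 20, i // 20) for i in range(400)]
    toy_predict = lambda s: (s[0] + 3, s[1] + 4)
    runs, overall = repeated_evaluation(toy_predict, toy_positions, toy_positions)
    print(len(runs), overall)
```

Reporting the mean over repeated random draws, rather than a single draw, makes figures such as 5.31 m and 5.18 m less sensitive to which 200 samples happen to be selected.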