| Graduate Student | 陳恩琦 (Chen, En-Chi) |
|---|---|
| Thesis Title | Enhance Interaction of Physically Challenged People with Computer Interface through Continual Deep Learning (利用持續性深度學習促進肢體障礙者與電腦介面之互動模式) |
| Advisor | 郭柏志 (Kuo, Po-Chih) |
| Committee Members | 陳顥齡 (Chen, Hao-Ling), 郭佩宜 (Kuo, Pei-Yi) |
| Degree | Master |
| Department | College of Electrical Engineering and Computer Science, Department of Computer Science |
| Publication Year | 2024 |
| Graduation Academic Year | 112 |
| Language | English |
| Pages | 75 |
| Keywords (Chinese) | 持續性深度學習、肢體障礙者、電腦介面 |
| Keywords (English) | Continual Deep Learning, Physically Challenged People, Computer Interface |
Abstract: Computers and networks have become an integral part of modern society. Moreover, they bridge the communication gap for physically challenged people who find it difficult to leave home. Most studies on computer interaction for physically challenged individuals focus on hardware; with the assistance of software, however, hardware tools can provide a more efficient and user-friendly experience. Our research focused on the context of pressing keyboard shortcuts. We designed a more time- and effort-efficient keyboard shortcut tool, SEST (Subtle Expression Shortcut Tab), which uses continual deep learning as its main method and takes subtle facial expressions as the activation signal. To prevent catastrophic forgetting in continual learning, we proposed the MIRNCM algorithm as our core strategy. We then conducted experiments to validate the effectiveness of SEST and MIRNCM. We recruited 10 participants without physical challenges and 3 participants with severe physical challenges (who could only perform clicks with an index finger) and adopted common Google Docs operations as the testing context. Participants without physical challenges showed a significant decrease (p < 0.05) in both time spent and total cursor distance when using SEST. Physically challenged participants experienced a decrease of over 30% in both time spent and total cursor distance with SEST. MIRNCM was compared with three other memory retrieval methods: Experience Replay, Maximally Interfered Retrieval, and Nearest Class Mean, and outperformed all of them with an overall accuracy greater than 0.95. In conclusion, SEST achieves time and effort efficiency during shortcut operations, and MIRNCM effectively prevents catastrophic forgetting.
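Of the memory retrieval baselines named above, Experience Replay is the simplest to illustrate: a small fixed-size buffer of past examples is maintained over the data stream (commonly via reservoir sampling) and rehearsed alongside new data to mitigate catastrophic forgetting. The sketch below is a minimal, framework-free illustration of that buffer; the class and parameter names are illustrative and are not taken from the thesis, which uses its own MIRNCM retrieval strategy.

```python
import random


class ReplayBuffer:
    """Fixed-size rehearsal memory filled by reservoir sampling,
    so every example seen in the stream has an equal chance of
    being retained regardless of when it arrived."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.buffer = []   # retained examples
        self.seen = 0      # total stream examples observed
        self.rng = random.Random(seed)

    def add(self, example):
        """Observe one stream example (reservoir sampling, Algorithm R)."""
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            # Replace a stored item with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = example

    def sample(self, k):
        """Draw a rehearsal batch to mix with the incoming batch."""
        return self.rng.sample(self.buffer, min(k, len(self.buffer)))
```

In a full continual-learning loop, each training step would combine a sampled rehearsal batch with the incoming batch before updating the model; Maximally Interfered Retrieval variants instead rank buffer items by how much a provisional update would increase their loss, and Nearest Class Mean classifies with per-class feature means.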