Graduate Student (研究生): 賴柏村 Lai, Po-Tsun
Thesis Title (論文名稱): 針對實體化交談介面開發基於行為衡量方法於自閉症小孩之評估系統 / Toward Automatic Assessment of Children with Autism Using Embodied Conversational Agents Based on Behavior-Based Measurement
Advisor (指導教授): 李祈均 Lee, Chi-Chun
Committee Members (口試委員): 冀泰石 Chi, Tai-Shih; 劉奕汶 Liu, Yi-Wen; 曹昱 Tsao, Yu
Degree (學位類別): Master (碩士)
Department (系所名稱): College of Electrical Engineering and Computer Science, Department of Electrical Engineering (電機資訊學院 - 電機工程學系)
Publication Year (論文出版年): 2016
Academic Year of Graduation (畢業學年度): 105
Language (語文別): Chinese (中文)
Number of Pages (論文頁數): 51
Chinese Keywords (中文關鍵詞): 泛自閉症障礙、實體化交談介面、自閉症診斷觀察量表、人類行為訊號處理
English Keywords (外文關鍵詞): Autism spectrum disorder; Embodied conversational agents; Autism diagnostic observation schedule; Behavioral signal processing
Abstract (摘要): Medical research characterizes autism spectrum disorder (ASD) by deficits in social interaction, difficulties in communication, and repetitive behaviors, which leave affected children particularly challenged in producing and interpreting both verbal and non-verbal behavior. Studies across multiple fields have shown that embodied conversational agents (ECAs) can improve social capabilities and communication skills in populations that struggle with them; in autism research, ECAs are frequently used to elicit natural behavior from children with ASD, including speech, emotion expression, and body movement. To quantify symptom severity, individuals on the autism spectrum are typically assessed with the Autism Diagnostic Observation Schedule (ADOS), a gold-standard instrument administered by clinicians trained in autistic disorders; in structured situations, it measures a child's responses across three core domains: communication, reciprocal social interaction, and a combined communication-social score. However, manual assessment is subjective, time-consuming, and hard to scale, so much of the available behavioral information goes unused. In this thesis, toward an automatic, large-scale autism assessment system, we design a system architecture built on an ECA and develop a behavior-based measurement method to support early detection of autism symptoms. The overall framework, implemented with behavioral signal processing (BSP) techniques, links three levels of representation: low-level multimodal signal features, mid-level behavior features, and high-level ADOS scores. With the support of digital technology, we hope to offer people a more convenient diagnostic tool and experts an objective reference for decision-making, ultimately improving daily life.
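The three-level association the abstract describes (low-level multimodal signal features, mid-level behavior features, high-level ADOS score) can be sketched as a simple pipeline. Everything below is illustrative: the feature names (`pitch`, `motion`), the aggregation rules, and the linear weights are assumptions for the sketch, not the thesis's actual features or model.

```python
# Illustrative sketch of the low -> mid -> high mapping; feature names,
# aggregation rules, and weights are hypothetical, not the thesis's model.
from statistics import mean

def mid_level_behaviors(frames):
    """Aggregate per-frame low-level features into session-level
    behavior descriptors (hypothetical examples)."""
    return {
        # pitch range as a crude proxy for vocal variability
        "vocal_variability": max(f["pitch"] for f in frames)
                             - min(f["pitch"] for f in frames),
        # average motion energy as a crude proxy for physical activity
        "motion_activity": mean(f["motion"] for f in frames),
    }

def ados_like_score(behaviors, weights):
    """Weighted sum of behavior descriptors as a stand-in for the
    high-level score predictor."""
    return sum(weights[name] * value for name, value in behaviors.items())

# Toy low-level frames: pitch in Hz, motion energy in arbitrary units.
frames = [{"pitch": 110.0, "motion": 0.2},
          {"pitch": 180.0, "motion": 0.6},
          {"pitch": 150.0, "motion": 0.4}]
behaviors = mid_level_behaviors(frames)
score = ados_like_score(behaviors,
                        {"vocal_variability": 0.01, "motion_activity": 1.0})
```

In the thesis's actual framework the last step would be a trained model rather than fixed weights; the sketch only shows how per-frame signal measurements are folded into session-level behavior descriptors before scoring.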