| Graduate student: | 賴彥霖 |
|---|---|
| Thesis title: | Mild Cognitive Impairment Detection Based on Self-introduction Task and Picture Description Task Using Speech and Linguistic features (基於自我介紹任務與圖片描述任務之輕度認知功能障礙檢測:使用語音與語言學特徵) |
| Advisor: | 劉奕汶 (LIU, YI-WEN) |
| Oral defense committee: | 李祈均 (LEE, CHI-CHUN); 呂菁菁 (LU, CHING-CHING); 徐慧娟 |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of publication: | 2025 |
| Academic year of graduation: | 114 (ROC calendar) |
| Language: | English |
| Number of pages: | 87 |
| Keywords: | Mild Cognitive Impairment, Self-Introduction, Picture Description, Speech features, Language features, Delaware Corpus |
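With the arrival of an aging society and a year-on-year rise in the number of people with dementia, cognitive function is a health issue that demands increasing attention. There is currently no cure for dementia, and cognitive function cannot return to a normal level once it has declined. Only early detection and intervention, while the impairment is still mild, offers a chance to slow the decline. The aim of this study is to use speech to detect whether a subject has mild cognitive impairment.
We collected mixed Mandarin and Southern Min speech data from older subjects in Taiwan, referred to as the New Taipei corpus. Participants completed a self-introduction task, the Cookie Theft picture description task, and the MMSE and MoCA cognitive tests. For the speech and corresponding transcripts of the two speech tasks, we designed two experimental frameworks to explore the feasibility of detecting mild cognitive impairment (MCI) against healthy controls (HC). We used both audio-based and text-based feature extraction strategies, yielding six categories of features: Acoustic, Speaking Rate, Duration, Part-of-Speech (POS), Syntactic Complexity, and Lexical Richness. Demographic features were also included in the analysis.
This study comprises two main experiments.
The first experiment performs binary HC/MCI classification using MMSE and MoCA scores. In this experiment we conducted a correlation analysis and manually selected the twenty features most strongly correlated with each cognitive test score; we also applied Sequential Feature Selection for automatic feature selection. Classification tasks were then built separately on the manually selected and the automatically selected features, with a support vector machine (SVM) as the classifier.
When the level of cognitive impairment is determined by the MoCA score, self-introduction task features selected by Sequential Backward Floating Selection (SBFS) allow HC/MCI classification to reach an F1-score of 82.67% and an accuracy of 85.24%.
During the experiments, we observed that MoCA provides better detection of MCI versus HC than MMSE. The manual and automatic feature selection results show that picture description task features are more sensitive to the cognitive domains measured by MMSE. The correlation analysis and the classification results with automatically selected features indicate that language performance in the self-introduction task is more sensitive than that in the picture description task for detecting early cognitive changes.
The second experiment investigates the HC/MCI discriminative potential of speech- and language-related features in the Cookie Theft picture description task. For cross-corpus comparison, this study retrieved the Delaware corpus, an American English corpus, from DementiaBank. We applied linear discriminant analysis (LDA) to the six feature categories to examine the separability of the MCI and HC groups. We then selected the most discriminative feature categories in each corpus and used their LDA projections to train SVM classifiers.
In the New Taipei corpus, Acoustic and Lexical Richness features showed stronger discriminative power, whereas in the Delaware corpus, non-acoustic features (such as Speaking Rate, Duration, POS, Syntactic Complexity, and Lexical Richness) were more useful. SVM classifiers trained on the combined LDA projections of the selected features achieved an F1-score of 48.17% and an accuracy of 69.49% on the New Taipei corpus, and an F1-score of 56.91% and an accuracy of 57.94% on the Delaware corpus. The results show that a dataset-specific feature selection strategy yields a small but consistent improvement in MCI detection performance.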
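An F1-score well below the accuracy, as in the New Taipei figures, can arise in more than one way, and the abstract does not state the class proportions or which class is treated as positive. As a purely illustrative sketch, the following Python snippet computes both metrics on hypothetical labels (not thesis data), assuming MCI is the positive class and F1 is computed for that class only.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1 of the positive class) for two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical example: 3 MCI (label 1) and 7 HC (label 0) participants.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
accuracy, f1 = accuracy_and_f1(y_true, y_pred)
```

In this toy case most healthy controls are classified correctly, so accuracy stays high, while two of the three MCI cases are missed, which pulls the F1-score down.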
With the aging of society and the continuing rise in the prevalence of dementia, cognitive function has become an increasingly critical public health concern. Since there is currently no cure for dementia and cognitive decline is irreversible once it occurs, early detection is essential. Timely intervention, while cognitive impairment remains mild, offers a chance to slow the progression of decline. The goal of this study is to explore the feasibility of detecting mild cognitive impairment (MCI) through speech analysis.
We collected a mixed Mandarin and Southern Min speech dataset from older subjects in Taiwan, referred to as the New Taipei corpus. Participants completed a self-introduction task, the Cookie Theft picture description task, and both the MMSE and MoCA cognitive assessments. Speech recordings and their corresponding transcripts from the two speech tasks were analyzed using both audio-based and text-based feature extraction strategies, resulting in six categories of features: Acoustic, Speaking Rate, Duration, Part-of-Speech (POS), Syntactic Complexity, and Lexical Richness. Demographic features were also included in the analysis.
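The abstract names the feature categories but not their exact definitions. As a minimal sketch of what a text-based and a timing-based feature could look like, the Python snippet below computes a type-token ratio (one common lexical richness measure, assumed here) and a tokens-per-second speaking rate. The function names, the sample transcript, and the duration are hypothetical, and the input is assumed to be already segmented into words, which for mixed Mandarin and Southern Min speech would require a separate word segmenter.

```python
def type_token_ratio(tokens):
    """Lexical richness proxy: distinct tokens / total tokens (0.0 if empty)."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def speaking_rate(tokens, duration_seconds):
    """Tokens produced per second of recorded speech."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(tokens) / duration_seconds


# Hypothetical, pre-segmented transcript (not taken from either corpus).
tokens = "the boy is taking a cookie and the stool is tipping".split()
features = {
    "lexical_ttr": type_token_ratio(tokens),
    "speaking_rate": speaking_rate(tokens, duration_seconds=5.5),
}
```

Each participant would contribute one such feature dictionary per task, which can then be stacked into a feature table for correlation analysis or classification.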
Two experimental frameworks were employed to investigate the feasibility of distinguishing MCI from healthy controls (HC). In the first experiment, binary classification was conducted using MMSE-based and MoCA-based labels. We conducted a correlation analysis, selected the 20 features most strongly correlated with each cognitive test score, and investigated automatic feature selection based on sequential feature selection methods. Classification was performed separately on the manually selected features and the automatically selected features. When the cognitive impairment level is determined by the MoCA score, self-introduction task features selected by sequential backward floating selection help the HC/MCI classification reach an F1-score of 82.67% and an accuracy of 85.24%.
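The manual selection step described above can be sketched as ranking features by the strength of their correlation with a test score. The snippet below is an illustration only: it assumes Pearson correlation (the abstract does not say which coefficient was used), and the feature names and values are hypothetical toy data, with `k=2` standing in for the top 20.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant feature carries no linear information
    return cov / math.sqrt(var_x * var_y)


def top_k_features(feature_table, scores, k=20):
    """Return the k feature names with the largest |r| against the scores."""
    ranked = sorted(
        feature_table,
        key=lambda name: abs(pearson(feature_table[name], scores)),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy data: one value per participant.
moca = [28, 25, 22, 19, 27]
feature_table = {
    "speaking_rate": [3.1, 2.8, 2.2, 1.9, 3.0],
    "pause_ratio": [0.10, 0.18, 0.25, 0.33, 0.12],
    "noun_ratio": [0.30, 0.21, 0.33, 0.25, 0.22],
}
selected = top_k_features(feature_table, moca, k=2)
```

Ranking by absolute value keeps features that fall as the score rises (such as the toy `pause_ratio`) alongside those that rise with it.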
During the experiments, it was observed that using MoCA as the labeling criterion provided better discrimination between MCI and HC than using MMSE. Both the manual and the automatic feature selection results suggested that features obtained from the picture description task were more sensitive to the cognitive domains measured by MMSE. The correlation analysis and the classification results with automatically selected features indicated that language performance in the self-introduction task was more sensitive for detecting early cognitive changes than that in the picture description task.
The second experiment aims to investigate the HC/MCI discriminative potential of the speech- and language-related features in the Cookie Theft picture description task. An American English dataset called the Delaware corpus was retrieved from DementiaBank for cross-corpus comparison. Linear discriminant analysis (LDA) was applied to the six feature categories to select the most discriminative feature categories in each dataset, and their LDA projections were used to train SVM classifiers. In the New Taipei dataset, Acoustic and Lexical Richness features demonstrated stronger discriminative power, while in the Delaware dataset, non-acoustic features such as Speaking Rate, Duration, POS, Syntactic Complexity, and Lexical Richness exhibited higher utility. SVM classifiers trained on concatenated LDA projections of the selected features achieved an F1-score of 48.17% and an accuracy of 69.49% for the New Taipei dataset, and an F1-score of 56.91% and an accuracy of 57.94% for the Delaware dataset. The results suggest that adopting a dataset-specific feature selection strategy can mildly improve the performance of MCI detection.
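The idea of concatenating per-category LDA projections can be sketched as follows. This is not the thesis implementation: it is a plain-Python two-class Fisher discriminant with a small ridge term added for numerical stability (an assumption), applied to hypothetical toy values for two feature categories. The SVM that would consume the concatenated scores is omitted.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def scatter(rows, mean):
    """Within-class scatter matrix: sum of outer products of deviations."""
    d = len(mean)
    s = [[0.0] * d for _ in range(d)]
    for row in rows:
        diff = [v - m for v, m in zip(row, mean)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fisher_direction(hc_rows, mci_rows, ridge=1e-6):
    """Two-class LDA direction w = Sw^-1 (mean_mci - mean_hc)."""
    m_hc, m_mci = mean_vector(hc_rows), mean_vector(mci_rows)
    s_hc, s_mci = scatter(hc_rows, m_hc), scatter(mci_rows, m_mci)
    d = len(m_hc)
    sw = [[s_hc[i][j] + s_mci[i][j] + (ridge if i == j else 0.0)
           for j in range(d)] for i in range(d)]
    return solve(sw, [b - a for a, b in zip(m_hc, m_mci)])


def project(rows, w):
    """Project each feature row onto the direction w."""
    return [sum(v * wi for v, wi in zip(row, w)) for row in rows]


def concatenated_projections(categories):
    """One LDA score per feature category, concatenated per participant."""
    hc_cols, mci_cols = [], []
    for hc_rows, mci_rows in categories:
        w = fisher_direction(hc_rows, mci_rows)
        hc_cols.append(project(hc_rows, w))
        mci_cols.append(project(mci_rows, w))
    return [list(r) for r in zip(*hc_cols)], [list(r) for r in zip(*mci_cols)]


# Hypothetical toy values: (HC rows, MCI rows) for two feature categories.
acoustic = ([[1.0, 2.0], [1.2, 1.8], [0.9, 2.2]],
            [[2.0, 1.0], [2.2, 1.1], [1.9, 0.8]])
lexical = ([[0.70], [0.65], [0.72]], [[0.50], [0.55], [0.48]])
hc_proj, mci_proj = concatenated_projections([acoustic, lexical])
```

Each participant ends up with one score per selected category, so the classifier sees a low-dimensional input regardless of how many raw features each category contains. In a real evaluation the LDA directions would need to be fitted on training folds only, to avoid leaking test labels into the projection.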