Author: Tsai, Yi Lin (蔡易霖)
Thesis title: Aspect-category-based Sentiment Classification with Aspect-Opinion Relation (應用意見詞與面向詞配對於面向類別評價極性分類)
Advisors: Hsu, Wen-Lian (許聞廉); Tsai, Tzong Han (蔡宗翰)
Oral defense committee: Soo, Von Wen (蘇豐文); Jason S. Chang (張俊盛); Wang, Hao Chuan (王浩全)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Institute of Information Systems and Applications
Year of publication: 2015
Academic year of graduation: 104 (ROC calendar)
Language: Chinese
Number of pages: 58
Chinese keywords: 意見探勘, 面向類別
English keywords: opinion mining, aspect category
    Aspect-category-based sentiment analysis has attracted growing attention in recent years. The task is to determine the sentiment that a review sentence expresses toward each of a set of predefined aspect categories; this thesis uses the categories defined by SemEval-2014, {food, price, service, ambience, miscellaneous}. We address two subtasks of SemEval-2014 Task 4 (aspect-based sentiment analysis): Aspect Category Detection (ACD) and Aspect Category Polarity Detection (ACP). Most SemEval-2014 submissions relied on machine learning with n-grams and sentiment lexicons as the main features. Such features struggle with opinion words that are general across categories: a word such as "good" (讚 in the Chinese abstract) can describe any aspect category, so it does not by itself indicate which category an opinion refers to. To overcome this difficulty, we extract the aspect term that each opinion word describes and use the (opinion word, aspect term) pair as a feature. Specifically, an opinion word identification system first finds the opinion words in a review sentence, and dependency rules then determine the corresponding aspect terms. The system has been implemented for the restaurant domain on Chinese customer reviews. In our experiments on aspect category polarity detection, Word2Vec features reach 87.5% accuracy, and adding the aspect-opinion pair features raises this to 88.3%. With all features combined, accuracy improves from 84.4% to 89.0%. These results show that aspect-opinion pair features are effective for aspect category polarity detection.
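The pairing step described in the abstract can be illustrated with a minimal sketch. This record does not list the thesis's actual dependency rules, so the two rules below are illustrative assumptions only, as are the names `Token`, `extract_pairs`, and `pair_features` and the Universal Dependencies style labels `amod` and `nsubj`. The thesis works on Chinese reviews; English glosses are used here for readability, and the parses are written by hand rather than produced by a parser.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    index: int   # 1-based position in the sentence
    word: str
    head: int    # index of the syntactic head, 0 for the root
    deprel: str  # dependency label (assumed Universal Dependencies style)


def extract_pairs(tokens, opinion_indices):
    """Return (opinion word, aspect term) pairs found by two toy rules.

    opinion_indices stands in for the output of an opinion word
    identification step; here it is supplied by hand.
    """
    by_index = {t.index: t for t in tokens}
    pairs = []
    for t in tokens:
        # Hypothetical rule 1: an opinion word modifies a noun ("great noodles").
        if t.index in opinion_indices and t.deprel == "amod" and t.head in by_index:
            pairs.append((t.word, by_index[t.head].word))
        # Hypothetical rule 2: a noun is the subject of an opinion predicate
        # ("the service is great").
        if t.deprel == "nsubj" and t.head in opinion_indices:
            pairs.append((by_index[t.head].word, t.word))
    return pairs


def pair_features(pairs):
    """Encode each pair as a sparse binary feature for a classifier."""
    return {f"pair={opinion}|{aspect}": 1 for opinion, aspect in pairs}


# "The service is great": "great" is the root, "service" is its subject.
sentence_a = [
    Token(1, "The", 2, "det"),
    Token(2, "service", 4, "nsubj"),
    Token(3, "is", 4, "cop"),
    Token(4, "great", 0, "root"),
]
# "great noodles": "great" modifies "noodles".
sentence_b = [
    Token(1, "great", 2, "amod"),
    Token(2, "noodles", 0, "root"),
]

pairs_a = extract_pairs(sentence_a, opinion_indices={4})
pairs_b = extract_pairs(sentence_b, opinion_indices={1})
print(pair_features(pairs_a))
print(pair_features(pairs_b))
```

The same general opinion word, "great", yields a different feature in each sentence because it is paired with a different aspect term. This is the property the abstract relies on to separate categories such as service and food.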

    Chapter 1 Introduction
      1.1 Traditional satisfaction surveys and online reviews
      1.2 Opinion mining and ABSA
      1.3 The ABSA framework and the positioning of this thesis
      1.4 Current practical applications of ABSA
      1.5 SemEval-2014 Task 4 ABSA
      1.6 Challenges of ABSA and research objectives
      1.7 Thesis organization
    Chapter 2 Problem Definition
      2.1 Problem definition of Aspect Category Detection (ACD)
      2.2 Problem definition of Aspect Category Polarity Detection (ACP)
    Chapter 3 Literature Review
      3.1 Opinion word and opinion target identification
        3.1.1 Pool-based
        3.1.2 Mention-based
      3.2 Document sentiment classification
      3.3 Aspect category sentiment polarity classification
        3.3.1 Benchmark datasets for ACD and ACP
        3.3.2 SemEval-2014 ABSA participating teams
    Chapter 4 Large-Scale Review Datasets
      4.1 iPeen (愛評網) review dataset
      4.2 Google review dataset
    Chapter 5 Dataset Annotated with Aspect Categories and Polarity
      5.1 Category definitions
      5.2 Constructing the dataset
    Chapter 6 Methods and Evaluation
      6.1 Preprocessing: word segmentation, part-of-speech tagging, and negation handling
      6.2 Building dictionaries
        6.2.1 PMI sentiment dictionary
        6.2.2 PMI category dictionary
        6.2.3 Word2vec word-vector dictionary
      6.3 Subsystem: Opinion Word Identification (OWI)
        6.3.1 Annotated dataset for opinion word identification
        6.3.2 Linear-chain CRFs
        6.3.3 Opinion word identification features
        6.3.4 Evaluation method
        6.3.5 Baseline configuration
        6.3.6 Experimental design and evaluation results
      6.4 Subsystem: Rating Regression (RR)
        6.4.1 Rating regression dataset
        6.4.2 Support vector regression (SVR)
        6.4.3 Rating regression features
        6.4.4 Rating regression evaluation method
        6.4.5 Rating regression experimental results
      6.5 Aspect Category Detection (ACD)
        6.5.1 Support Vector Machine
        6.5.2 Aspect category detection features
        6.5.3 The multi-label classification problem
        6.5.4 Aspect category detection evaluation method
        6.5.5 Aspect category detection experimental results
      6.6 Aspect Category Polarity (ACP)
        6.6.1 Aspect category polarity features
        6.6.2 Aspect category polarity evaluation method
        6.6.3 Aspect category polarity experiments
    Chapter 7 Conclusion

    Full text availability: not authorized for public release, on either the campus network or off-campus networks.
