Graduate student: | 蔡易霖 Tsai, Yi Lin |
---|---|
Thesis title: | 應用意見詞與面向詞配對於面向類別評價極性分類 (Aspect-category-based Sentiment Classification with Aspect-Opinion Relation) |
Advisors: | 許聞廉 Hsu, Wen-Lian; 蔡宗翰 Tsai, Tzong Han |
Oral defense committee: | 蘇豐文 Soo, Von Wen; 張俊盛 Jason S. Chang; 王浩全 Wang, Hao Chuan |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications |
Year of publication: | 2015 |
Academic year of graduation: | 104 |
Language: | Chinese |
Number of pages: | 58 |
Keywords (Chinese): | 意見探勘, 面向類別 |
Keywords (English): | opinion mining, aspect category |
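In recent years, aspect-category-based sentiment analysis has received growing attention. The task is to determine the sentiment that a review sentence expresses toward a predefined aspect category {food, price, service, ambience, miscellaneous}. In this thesis, we work on two sub-tasks of SemEval-2014 Task 4: Aspect Category Detection (ACD) and Aspect Category Polarity Detection (ACP). In SemEval-2014, most teams used n-grams and sentiment lexicons as their main machine learning features. However, for opinion words that apply across categories (for example, 「讚」, "great", can describe every aspect category), these features can hardly tell which category the opinion describes. In contrast, we first extract the aspect term that each opinion word describes and use (opinion word, aspect term) pairs as features to resolve this difficulty. Concretely, an opinion word identification system first finds the opinion words in a review sentence, and dependency rules then determine the corresponding aspect terms. The system has so far been built for the restaurant review domain. Experiments show that using Word2Vec as features reaches 87.5% accuracy, and adding the (opinion word, aspect term) pair features reaches 88.3%. Using all features together raises accuracy from 84.4% to 89.0%. The results show that the pairs are effective for aspect category polarity detection.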
In recent years, research on aspect-category-based sentiment analysis, which judges the sentiment a review expresses toward predefined categories, has drawn growing attention. In this thesis, we target two sub-tasks of SemEval-2014 Task 4 on aspect-based sentiment analysis: aspect category detection and aspect category polarity detection. We use the set of aspect categories predefined by SemEval-2014: {food, price, service, ambience, miscellaneous}. Most submissions addressed these two sub-tasks with machine learning, relying mainly on n-gram and sentiment lexicon features. The difficulty with these features is that some opinion words (e.g., "good") are general and do not point to any particular category. By contrast, we use aspect-opinion pairs as one of the features to overcome this difficulty. To detect these pairs, we identify the opinion words in customer reviews and then detect their related aspect terms with dependency rules. The system has been implemented for Chinese customer reviews in the restaurant domain. In our experiments on aspect category polarity detection, Word2Vec features achieve 87.5% accuracy, and adding aspect-opinion pair features raises this to 88.3%. When all features are used, accuracy improves from 84.4% to 89.0%. The experimental results demonstrate the effectiveness of aspect-opinion pair features for aspect-category-based sentiment classification.
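The pairing step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the `Dependency` triple layout, the two rules over `nsubj` and `amod` relations, the tiny opinion lexicon, and the feature-name format are hypothetical and are not the thesis's actual rules or system. English tokens are used for readability, although the thesis works on Chinese reviews.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical parser output: (head, relation, dependent) triples.
Dependency = Tuple[str, str, str]


def extract_pairs(dependencies: List[Dependency],
                  opinion_words: Set[str]) -> List[Tuple[str, str]]:
    """Return (opinion word, aspect term) pairs found by two illustrative rules."""
    pairs = []
    for head, relation, dependent in dependencies:
        # Rule 1 (assumed): "the service was slow" gives nsubj(slow, service).
        if relation == "nsubj" and head in opinion_words:
            pairs.append((head, dependent))
        # Rule 2 (assumed): "good food" gives amod(food, good).
        elif relation == "amod" and dependent in opinion_words:
            pairs.append((dependent, head))
    return pairs


def pair_features(pairs: List[Tuple[str, str]]) -> Dict[str, int]:
    """Turn each pair into a binary feature for a downstream classifier."""
    return {f"pair={opinion}|{aspect}": 1 for opinion, aspect in pairs}


# Toy lexicon and parse of "good food, but the service was slow".
opinion_words = {"good", "slow"}
dependencies = [
    ("food", "amod", "good"),
    ("slow", "nsubj", "service"),
    ("slow", "cop", "was"),
]

pairs = extract_pairs(dependencies, opinion_words)
features = pair_features(pairs)
print(pairs)
print(features)
```

The point of the pair feature is that a general opinion word such as "good" becomes category-specific once it is tied to "food", which an n-gram or lexicon feature alone cannot express.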