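Graduate student: | 李孟潔 Lee, Meng-Chieh |
---|---|
Thesis title: | 利用機器學習作法之中文意見分析 Opinion Analysis of Chinese Text using Machine Learning |
Advisor: | 張俊盛 Chang, Jason S. |
Oral defense committee: | |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Year of publication: | 2009 |
Academic year of graduation: | 97 |
Language: | Chinese |
Number of pages: | 34 |
Chinese keywords: | 意見分析 (opinion analysis), 機器學習 (machine learning), 情緒 (emotion), 評論 (review), 分類 (classification) |
English keywords: | opinion analysis, semantic orientation, machine learning |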
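This thesis studies the rating categories of review articles and proposes a rating classification system that automatically assigns a review article to its corresponding rating category. The approach extracts a variety of features from the training articles and uses a machine learning module to learn the correlation between each feature and the rating categories, thereby training a rating classification system.
We adopt the Maximum Entropy (ME) method as our machine learning module. During training, we use review articles collected from the Web together with a category dictionary to extract opinion-bearing words and phrases as features, and feed these feature sets to ME for training, which yields a rating classification model. At run time, given a review article, we obtain its feature set with the same extraction procedure and then use the trained classification model to output the corresponding rating.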
This paper addresses the task of opinion classification. We propose a method that automatically assigns a review to an appropriate evaluation category. Our method extracts features from the training data and uses a machine learning algorithm to train an evaluation classification system. We select Maximum Entropy (ME) as our machine learning module. At training time, we use a review corpus collected from the Web and a category dictionary to extract opinion words and phrases as our feature set. We then train ME on this feature set to obtain an evaluation classification module. At run time, a given review is automatically transformed into a feature set and sent to the classification module, which returns a suitable evaluation.
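The training and run-time stages described in the abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the lexicon, the toy English reviews, the two rating labels, and every function name are hypothetical, whitespace tokenization stands in for Chinese word segmentation, and a small hand-written multinomial logistic regression (which has the same form as a conditional maximum entropy classifier) stands in for the ME toolkit.

```python
import math
from collections import defaultdict

# Hypothetical opinion lexicon standing in for the category dictionary.
OPINION_LEXICON = {"great", "excellent", "love", "boring", "terrible", "awful"}

def extract_features(review):
    """Return a bias feature plus one feature per lexicon word in the review."""
    tokens = review.lower().split()  # assumption: whitespace tokenization
    return {"bias"} | {f"word={t}" for t in tokens if t in OPINION_LEXICON}

def predict_proba(weights, classes, feats):
    """Softmax over per-class sums of feature weights."""
    scores = {c: sum(weights[(c, f)] for f in feats) for c in classes}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}

def train_maxent(feature_sets, labels, epochs=200, lr=0.5, l2=0.01):
    """Batch gradient ascent on the L2-regularized conditional log-likelihood."""
    classes = sorted(set(labels))
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, gold in zip(feature_sets, labels):
            probs = predict_proba(weights, classes, feats)
            for c in classes:
                # Observed minus expected count of each (class, feature) pair.
                err = (1.0 if c == gold else 0.0) - probs[c]
                for f in feats:
                    grad[(c, f)] += err
        for key, g in grad.items():
            weights[key] += lr * (g / len(feature_sets) - l2 * weights[key])
    return weights, classes

def classify(weights, classes, review):
    """Run-time stage: extract features, then return the most probable label."""
    probs = predict_proba(weights, classes, extract_features(review))
    return max(classes, key=lambda c: probs[c])

# Toy training corpus (hypothetical); the thesis uses reviews collected from the Web.
train_data = [
    ("great acting and excellent story", "positive"),
    ("i love this film", "positive"),
    ("boring plot and terrible pacing", "negative"),
    ("awful sound and boring dialogue", "negative"),
]
feature_sets = [extract_features(text) for text, _ in train_data]
labels = [label for _, label in train_data]
weights, classes = train_maxent(feature_sets, labels)

print(classify(weights, classes, "excellent and great"))
```

The same `extract_features` function is used for both training and classification, mirroring the abstract's point that the run-time review goes through the same feature extraction as the training articles.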