Graduate student: | 方麗娜 Laina Farsiah |
---|---|
Thesis title: | 印尼文情緒偵測於不均衡微網誌資料之研究 Emotion Detection for Unbalanced Indonesian Tweets |
Advisor: | 陳宜欣 Chen, Yi Shin |
Oral defense committee: | 陳朝欽 Chen, Chaur-Chin; 韓永楷 Hon, Wing-Kai |
Degree: | 碩士 Master |
Department: | 電機資訊學院 - 資訊工程學系 Computer Science |
Year of publication: | 2015 |
Academic year of graduation: | 104 |
Language: | English |
Number of pages: | 38 |
Keywords (Chinese): | 情緒偵測、不均衡資料、推特、印尼文 |
Keywords (English): | Emotion Detection, Unbalanced Data, Twitter, Indonesian Language |
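印尼文情緒偵測於不均衡微網誌資料之研究
Abstract (translated from the Chinese)
In recent years, mining Twitter data has become a popular research topic, and emotion analysis on microblogs is one of its many directions. Recently, researchers proposed a graph-based technique for extracting emotion patterns, which has achieved good results in several languages. This study aims to improve the precision of emotion analysis for Indonesian tweets across eight emotions: joy (senang), sadness (sedih), fear (takut), surprise (terkejut), disgust (jijik), anticipation (antisipasi), trust (percaya), and anger (marah). In earlier work, the precision of emotion analysis for Indonesian was unsatisfactory, mainly because emotions are unevenly distributed in Indonesian tweets. This study therefore proposes a method that adjusts emotion pattern weights to address the unbalanced distribution. Experimental results show that the method can (significantly) improve the precision of emotion analysis on Indonesian tweets.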
Emotion Detection for Unbalanced Indonesian Tweets
ABSTRACT
Twitter mining has become an active research topic in recent years. Emotion
detection is one research area that uses microblogs, such as Twitter, to discover emotions in
textual data. Recently, a novel graph-based technique was proposed to extract patterns that
bear emotion. The system has achieved good performance in different languages. By adopting
the system, we aim to enhance the accuracy of emotion detection for the Indonesian language,
covering eight emotions: joy (senang), sadness (sedih), fear (takut), surprise (terkejut),
disgust (jijik), anticipation (antisipasi), trust (percaya), and anger (marah). The data distribution
among the emotions is highly unbalanced, which lowers the precision of the system for the Indonesian
language. In this study, we propose a pattern-weight adjustment to address the unbalanced data problem
for the Indonesian language. The experimental results show that the proposed approach can improve
precision for unbalanced Indonesian data.