Graduate student: | 謝祥彥 Hsieh, Hsiang Yen |
---|---|
Thesis title: | 透過分類法與社會網路分析研究惡意電話之行為 Spam Calls Analysis Using Classification and Social Network Analysis |
Advisor: | 王俊程 Wang, Jyun Cheng |
Oral defense committee: | 王貞雅 Wang, Chen Ya; 江成欣 Chiang, Cheng Hsin |
Degree: | Master |
Department: | College of Technology Management, Institute of Service Science |
Year of publication: | 2016 |
Academic year of graduation: | 104 |
Language: | English |
Number of pages: | 57 |
Keywords (Chinese): | 惡意電話、詐騙電話、騷擾電話、行銷電話、分類樹、邏輯回歸分析、社會網路分析 |
Keywords (English): | spam call, fraud call, harassed call, marketing call, classification tree, logistic regression, social network analysis |
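Spam calls are rampant in the real world. According to a survey, people in Taiwan waste an average of 150,000 hours per month on spam calls, which mainly comprise fraud calls, harassment calls, and marketing calls, and the annual losses caused by fraud calls exceed NT$3.7 billion. However, past studies have focused only on detecting spam calls, without examining whether a given call is a serious fraud or an ordinary marketing call.
In this thesis, we analyze spam call data from a well-known call-detection app, including the category, call duration, and call date of each spam call. After preprocessing, the data are aggregated into a form suitable for analysis. We then use oversampling to eliminate the data imbalance and apply multiple logistic regression analysis to the multi-class classification task, obtaining a model that can classify the three kinds of spam calls. In addition, we use social network analysis to find the overlap among different kinds of spam calls, which further helps us determine whether a call is a spam call.
Beyond giving a better understanding of spam call behavior, the analysis results also reveal similarities within the different categories. Compared with earlier methods, our new detection method can further divide spam calls into three categories.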
Spam calls are everywhere. According to one survey, people in Taiwan waste almost 150,000 hours on spam calls per month. Spam calls include fraud, harassment, and marketing calls. Moreover, fraud calls alone cause losses of more than 3.7 billion NTD every year. Although many studies address spam call detection, few try to classify spam calls by category.
In this research, we obtain a large dataset of spam call logs that include each call's category, duration, and date. We first preprocess and aggregate the data, then use oversampling to overcome the problem of imbalanced data. In addition, we implement multiple logistic regression models to handle the multi-class classification, and build models that can classify spam calls into three categories. We also use social network analysis to find the social relationships among calls within some subgroups.
In conclusion, different kinds of spam calls behave in clearly different ways, and they can be identified by using classification and social network analysis. However, spammers' behavior may change over time, so a one-time analysis is not sufficient. New models need to be trained routinely to keep up with the changing behavior.
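The oversampling and multi-class logistic regression steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not say whether one multinomial model or several binary models were used, so a single softmax (multinomial) model is assumed here. The two features, the toy rows, and all function names are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter

CLASSES = ["fraud", "harassment", "marketing"]

def oversample(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until all classes are the same size."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(rows, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_rows, out_labels = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out_rows.extend(xs + extra)
        out_labels.extend([y] * target)
    return out_rows, out_labels

def softmax(scores):
    """Turn raw class scores into probabilities (shifted for numerical stability)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_probs(W, x):
    xb = list(x) + [1.0]  # append a constant 1.0 so the last weight acts as a bias
    return softmax([sum(w * v for w, v in zip(wk, xb)) for wk in W])

def fit_multinomial(rows, labels, classes, lr=0.5, epochs=3000):
    """Fit softmax regression with plain batch gradient descent."""
    n = len(rows[0]) + 1
    W = [[0.0] * n for _ in classes]  # one weight vector per class
    for _ in range(epochs):
        grad = [[0.0] * n for _ in classes]
        for x, y in zip(rows, labels):
            probs = class_probs(W, x)
            xb = list(x) + [1.0]
            for k, c in enumerate(classes):
                err = probs[k] - (1.0 if c == y else 0.0)
                for j, v in enumerate(xb):
                    grad[k][j] += err * v
        for k in range(len(classes)):
            for j in range(n):
                W[k][j] -= lr * grad[k][j] / len(rows)
    return W

def predict(W, classes, x):
    probs = class_probs(W, x)
    return classes[probs.index(max(probs))]

# Hypothetical toy data: [scaled call duration, scaled calls per day].
rows = [
    [0.20, 0.80], [0.25, 0.90], [0.15, 0.85],
    [0.30, 0.75], [0.20, 0.95], [0.10, 0.80],   # marketing (majority class)
    [0.90, 0.20], [0.85, 0.10],                 # fraud
    [0.05, 0.10], [0.10, 0.05], [0.00, 0.15],   # harassment
]
labels = ["marketing"] * 6 + ["fraud"] * 2 + ["harassment"] * 3

bal_rows, bal_labels = oversample(rows, labels)
W = fit_multinomial(bal_rows, bal_labels, CLASSES)
print(Counter(bal_labels))
print(predict(W, CLASSES, [0.90, 0.15]))
```

Random oversampling is only one way to address class imbalance, and it should be applied to the training split alone so that duplicated rows do not leak into evaluation data.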