Graduate student: | Hsieh, Wen-Ting (謝文婷) |
---|---|
Thesis title: | Core-Based Community Detection in Large-Scale Networks (大型網路中以核心為基礎之社群偵測) |
Advisor: | Chang, Cheng-Shang (張正尚) |
Oral defense committee: | 李端興, 林華君, 黃之浩 |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Institute of Communications Engineering |
Year of publication: | 2014 |
Academic year of graduation: | 102 |
Language: | English |
Number of pages: | 39 |
Keywords (Chinese): | 社群偵測, 巨量資料, 大型網路 |
Keywords (English): | community detection, big data, large-scale networks |
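With the advance of technology, the volume of data that can be collected has surged, and the corresponding networks have become large and complex. When we perform community detection on large-scale networks, most existing methods are no longer suitable. The reason is that most algorithms need to look up every node in the network at every stage, which is hard to do in a large-scale network because both the time complexity and the computational complexity grow sharply. For this reason, we develop a community detection algorithm for large-scale networks.
In our thesis, we define a subset that can represent a set as the core of that set. When performing community detection in a large-scale network, we can ignore the set and focus on the core that represents it. We propose a core-based local community detection algorithm to verify that the core of a set can represent the set, and we test the algorithm on LFR benchmark graphs and the DBLP network (an author-centric paper co-authorship network). The algorithm performs well on LFR benchmark graphs with strong community strength, where it reaches almost 100% precision and recall. On the DBLP network, however, it performs poorly; we attribute this to the overlapping communities in the DBLP network.
Finally, we develop a core-based community detection algorithm for large-scale networks that does not need to look up every node in the network at every stage. Using the LFR benchmark networks and the DBLP network, we find a suitable core parameter and compare the clustering quality against other community detection algorithms for large-scale networks.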
With technological development, the volume of collected data has increased very rapidly, and the corresponding graphs have become much larger and more complex. On large-scale networks, most existing community detection methods [1] do not work well. The reason is that most of these algorithms need to traverse the whole network at each step, which is infeasible when the network is huge: both the time complexity and the computational cost grow rapidly. For this reason, we are interested in developing a community detection algorithm for large-scale networks that does not traverse the whole network at each step.
In this thesis, we define the core of a set, a subset that can be used to represent the set. In large-scale networks, we can ignore the set S and focus on the core of S during community detection. We propose an algorithm, called the core-based local community detection algorithm, to verify that the core of a set can represent the set, and we test it on the LFR benchmark graphs and the DBLP co-authorship network. The core-based local community detection algorithm performs well on the LFR benchmark graphs: for communities with strong community strength, it reaches almost 100% precision and 100% recall. However, it does not perform well when the network has overlapping communities.
Finally, we develop a core-based community detection algorithm for large-scale networks, which need not traverse the whole network at each step, and apply it to the LFR benchmark graphs and the DBLP co-authorship network. We also compare it with three other methods: label propagation, fast unfolding, and greedy optimization of modularity.
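The precision and recall figures quoted in the abstract can be illustrated with a minimal sketch. It assumes the standard set-based definitions (a detected node set compared against one ground-truth community); the thesis may define or average these measures differently. The function name `precision_recall` and the node sets are hypothetical.

```python
def precision_recall(detected, truth):
    """Return (precision, recall) of a detected node set against a
    ground-truth community, using the usual set-based definitions
    (an assumption; not taken from the thesis itself)."""
    detected, truth = set(detected), set(truth)
    overlap = len(detected & truth)
    # Precision: fraction of detected nodes that truly belong to the community.
    precision = overlap / len(detected) if detected else 0.0
    # Recall: fraction of the true community that was detected.
    recall = overlap / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical example: nodes 1-5 form the true community; the detector
# returns nodes 1-4 plus one stray node 9.
truth = {1, 2, 3, 4, 5}
detected = {1, 2, 3, 4, 9}
p, r = precision_recall(detected, truth)
print(p, r)  # 0.8 0.8
```

With overlapping communities, a node may belong to several ground-truth sets at once, so a single detected set can score poorly against any one of them.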
[1] Santo Fortunato. Community detection in graphs. Physics Reports, 486(3):75–174, 2010.
[2] Ulrik Brandes, Garry Robins, Ann McCranie, and Stanley Wasserman. What is network science? Network Science, 1(1), 2013.
[3] Mason A Porter, Jukka-Pekka Onnela, and Peter J Mucha. Communities in networks. Notices of the AMS, 56(9):1082–1097, 2009.
[4] Mark EJ Newman and Michelle Girvan. Finding and evaluating community structure in networks. Physical Review E, 69(2):026113, 2004.
[5] Filippo Radicchi, Claudio Castellano, Federico Cecconi, Vittorio Loreto, and Domenico Parisi. Defining and identifying communities in networks. Proceedings of the National Academy of Sciences of the United States of America, 101(9):2658–2663, 2004.
[6] Fang Wu and Bernardo A Huberman. Finding communities in linear time: a physics approach. The European Physical Journal B-Condensed Matter and Complex Systems, 38(2):331–338, 2004.
[7] Jordi Duch and Alex Arenas. Community detection in complex networks using extremal optimization. Physical Review E, 72(2):027104, 2005.
[8] Usha Nandini Raghavan, Réka Albert, and Soundar Kumara. Near linear time algorithm to detect community structures in large-scale networks. Physical Review E, 76(3):036106, 2007.
[9] Jaewon Yang and Jure Leskovec. Defining and evaluating network communities based on ground-truth. In Proceedings of the ACM SIGKDD Workshop on Mining Data Semantics, page 3. ACM, 2012.
[10] Mark EJ Newman. Fast algorithm for detecting community structure in networks. Physical Review E, 69(6):066133, 2004.
[11] Aaron Clauset, Mark EJ Newman, and Cristopher Moore. Finding community structure in very large networks. Physical Review E, 70(6):066111, 2004.
[12] Andrea Lancichinetti, Santo Fortunato, and János Kertész. Detecting the overlapping and hierarchical community structure in complex networks. New Journal of Physics, 11(3):033015, 2009.
[13] Inderjit S Dhillon, Yuqiang Guan, and Brian Kulis. Kernel k-means: spectral clustering and normalized cuts. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 551–556. ACM, 2004.
[14] Brian Kulis, Sugato Basu, Inderjit Dhillon, and Raymond Mooney. Semi-supervised graph clustering: a kernel approach. Machine Learning, 74(1):1–22, 2009.
[15] Bo Long, Zhongfei Mark Zhang, and Philip S Yu. A probabilistic framework for relational clustering. In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 470–479. ACM, 2007.
[16] Brian Karrer and Mark EJ Newman. Stochastic block models and community structure in networks. Physical Review E, 83(1):016107, 2011.
[17] Martin Rosvall and Carl T Bergstrom. An information-theoretic framework for resolving community structure in complex networks. Proceedings of the National Academy of Sciences, 104(18):7327–7331, 2007.
[18] Martin Rosvall and Carl T Bergstrom. Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences, 105(4):1118–1123, 2008.
[19] Gergely Palla, Imre Derényi, Illés Farkas, and Tamás Vicsek. Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435(7043):814–818, 2005.
[20] Leon Danon, Albert Diaz-Guilera, Jordi Duch, and Alex Arenas. Comparing community structure identification. Journal of Statistical Mechanics: Theory and Experiment, 2005(09):P09008, 2005.
[21] Andrea Lancichinetti and Santo Fortunato. Community detection algorithms: A comparative analysis. Physical Review E, 80(5):056117, 2009.
[22] Jure Leskovec, Kevin J Lang, and Michael Mahoney. Empirical comparison of algorithms for network community detection. In Proceedings of the 19th international conference on World wide web, pages 631–640. ACM, 2010.
[23] Cheng-Shang Chang, Chih-Jung Chang, and Duan-Shin Lee. Relative centrality and local community detection. Master’s thesis, National Tsing Hua University, 2013.
[24] Daniel A Spielman and Shang-Hua Teng. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In Proceedings of the thirty-sixth annual ACM symposium on Theory of computing, pages 81–90. ACM, 2004.
[25] Aaron Clauset. Finding local community structure in networks. Physical Review E, 72(2):026132, 2005.
[26] Reid Andersen, Fan Chung, and Kevin Lang. Local graph partitioning using pagerank vectors. In Foundations of Computer Science, 2006. FOCS’06. 47th Annual IEEE Symposium on, pages 475–486. IEEE, 2006.
[27] Reid Andersen and Kevin J Lang. Communities from seed sets. In Proceedings of the 15th international conference on World Wide Web, pages 223–232. ACM, 2006.
[28] Xuan-Chao Huang, Jay Cheng, Hsin-Hung Chou, Chih-Heng Cheng, and Hsien-Tsan Chen. Detecting overlapping communities in networks based on a simple node behavior model. Preprint, 2013.
[29] Michelle Girvan and Mark EJ Newman. Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12):7821–7826, 2002.
[30] Andrea Lancichinetti, Santo Fortunato, and Filippo Radicchi. Benchmark graphs for testing community detection algorithms. Physical Review E, 78(4):046110, 2008.
[31] Jure Leskovec, Polo Chau, and Ana Pavlisic. Stanford network analysis platform. http://snap.stanford.edu/data/index.html.
[32] Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10):P10008, 2008.