Exploiting small world property for network clustering

  • Authors:
  • Tieyun Qian;Qing Li;Jaideep Srivastava;Zhiyong Peng;Yang Yang;Shuo Wang

  • Affiliations:
  • State Key Laboratory of Software Engineering, Wuhan University, Wuhan, China and State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; State Key Laboratory of Software Engineering, Wuhan University, Wuhan, China and Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong; College of Science & Engineering, University of Minnesota, Minneapolis, USA 55453; State Key Laboratory of Software Engineering, Wuhan University, Wuhan, China; State Key Laboratory of Software Engineering, Wuhan University, Wuhan, China; State Key Laboratory of Software Engineering, Wuhan University, Wuhan, China

  • Venue:
  • World Wide Web
  • Year:
  • 2014

Abstract

Graph partitioning is a traditional problem with many applications, and a number of high-quality algorithms have been developed for it. Recently, demand for social network analysis has aroused new research interest in graph partitioning and clustering. Social networks differ from conventional graphs in that they exhibit key properties such as the power-law and small-world properties. Currently, these features are largely neglected by popular partitioning algorithms. In this paper, we present a novel framework that leverages the small-world property to find clusters in social networks. The framework has several key features. First, we define a total order that combines the edge weight, the small-world weight, and the hub value to better reflect the connection strength between two vertices. Second, we design a strategy that uses this ordered list to greedily, yet effectively, refine the results of existing partitioning algorithms for common objective functions. Third, the proposed method is independent of the original approach, so it can be integrated with any type of existing graph clustering algorithm. We conduct an extensive performance study on both real-life and synthetic datasets. The empirical results clearly demonstrate that our framework significantly improves the output of state-of-the-art methods. Furthermore, we show that the proposed method returns clusters of higher quality by both internal and external measures.
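
The idea of ordering edges and then greedily refining an existing partition can be illustrated with a minimal sketch. The abstract does not define the small-world weight or the hub value, so the code below uses hypothetical stand-ins: the number of neighbors shared by the two endpoints for the small-world weight, and the smaller endpoint degree for the hub value. The objective (total weight of cut edges) and the function names `order_edges`, `refine`, and `cut_weight` are likewise assumptions made for illustration, not the authors' method.

```python
def build_adj(weight):
    """Build an undirected adjacency map from a {(u, v): w} edge dictionary."""
    adj = {}
    for u, v in weight:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def edge_w(weight, u, v):
    """Look up an undirected edge weight regardless of endpoint order."""
    return weight.get((u, v), weight.get((v, u), 0))


def order_edges(adj, weight):
    """Sort edges from strongest to weakest under a lexicographic key.

    Key = (edge weight, small-world weight, hub value, edge).
    The last component only breaks ties so the order is total.
    """
    def key(edge):
        u, v = edge
        sw = len(adj[u] & adj[v])            # ASSUMED stand-in: shared neighbors
        hub = min(len(adj[u]), len(adj[v]))  # ASSUMED stand-in: smaller degree
        return (weight[edge], sw, hub, edge)
    return sorted(weight, key=key, reverse=True)


def cut_weight(weight, labels):
    """Total weight of edges whose endpoints lie in different clusters."""
    return sum(w for (u, v), w in weight.items() if labels[u] != labels[v])


def refine(weight, labels):
    """Greedily refine a partition by walking edges in descending order.

    For each edge that crosses clusters, move one endpoint to the other
    endpoint's cluster if that strictly lowers the cut weight.
    """
    adj = build_adj(weight)
    labels = dict(labels)  # leave the caller's partition untouched
    for u, v in order_edges(adj, weight):
        if labels[u] == labels[v]:
            continue
        for a, b in ((u, v), (v, u)):
            target = labels[b]
            to_target = sum(edge_w(weight, a, n) for n in adj[a] if labels[n] == target)
            to_own = sum(edge_w(weight, a, n) for n in adj[a] if labels[n] == labels[a])
            if to_target > to_own:
                labels[a] = target
                break
    return labels
```

Because `refine` only takes a set of weighted edges and a label assignment, it can be run on the output of any base partitioner, which mirrors the independence from the original approach that the abstract describes.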