ESC: An efficient synchronization-based clustering algorithm

Authors:
Jianbin Huang; Heli Sun; Jianmei Kang; Junjie Qi; Hongbo Deng; Qinbao Song
Affiliations:
School of Software, Xidian University, Xi'an 710071, China; Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China; School of Software, Xidian University, Xi'an 710071, China; School of Software, Xidian University, Xi'an 710071, China; Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
Venue:
Knowledge-Based Systems
Year:
2013

Citing 12
Cited 0

Silhouettes: a graphical aid to the interpretation and validation of cluster analysis
Journal of Computational and Applied Mathematics

CURE: an efficient clustering algorithm for large databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data

Density biased sampling: an improved method for data mining and clustering
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data

Mean Shift: A Robust Approach Toward Feature Space Analysis
IEEE Transactions on Pattern Analysis and Machine Intelligence

Chameleon: Hierarchical Clustering Using Dynamic Modeling
Computer

Efficient Biased Sampling for Approximate Clustering and Outlier Detection in Large Data Sets
IEEE Transactions on Knowledge and Data Engineering

Information theoretic measures for clusterings comparison: is a correction for chance necessary?
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning

Clustering by synchronization
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining

Data clustering with size constraints
Knowledge-Based Systems

Synchronization based outlier detection
ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part III

Understanding of Internal Clustering Validation Measures
ICDM '10 Proceedings of the 2010 IEEE International Conference on Data Mining

Detection of Arbitrarily Oriented Synchronized Clusters in High-Dimensional Data
ICDM '11 Proceedings of the 2011 IEEE 11th International Conference on Data Mining

Abstract

Clustering is an essential approach for detecting the intrinsic groups in data. This paper proposes an efficient clustering algorithm based on a generalized local synchronization model. The algorithm uses a novel stopping criterion for data synchronization, which lets it detect clusters before perfect synchronization is reached. In addition, a density-biased sampling method extracts samples from the original data set, and the clustering structure can be effectively revealed on these samples, which significantly improves clustering efficiency. Using a cluster validity criterion, the proposed algorithm can find clusters of arbitrary number, shape, size, and density, and can isolate noise in vector data, without any assumption about the data distribution. Extensive experiments on several synthetic and real-world data sets show that the proposed algorithm achieves high accuracy and is more efficient than the state-of-the-art synchronization-based clustering method.
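
The abstract does not give the update rule or the stopping criterion, so the following is only a minimal sketch of the general idea of clustering by local synchronization, not the ESC algorithm itself. It assumes a Kuramoto-style sine coupling between each point and its neighbors within a radius eps, and it stops as soon as no point moves more than a small tolerance, which stands in for the paper's stopping criterion. The names sync_step, sync_cluster, eps, tol and merge_tol are hypothetical, and density-biased sampling and the cluster validity criterion are not included.

```python
import math


def neighbors(points, i, eps):
    """Indices of all points within distance eps of points[i], including i."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def sync_step(points, eps):
    """One synchronous update: each point is pulled toward its eps-neighbors
    by a sine coupling applied per dimension (assumed interaction model)."""
    updated = []
    for i, p in enumerate(points):
        nbrs = neighbors(points, i, eps)
        updated.append([
            p[d] + sum(math.sin(points[j][d] - p[d]) for j in nbrs) / len(nbrs)
            for d in range(len(p))
        ])
    return updated


def sync_cluster(points, eps, tol=1e-3, max_iter=100, merge_tol=1e-2):
    """Iterate until the largest displacement drops below tol (a stand-in
    stopping rule), then give the same label to points that have merged."""
    pts = [list(p) for p in points]
    for _ in range(max_iter):
        nxt = sync_step(pts, eps)
        shift = max(math.dist(a, b) for a, b in zip(pts, nxt))
        pts = nxt
        if shift < tol:
            break

    labels = [-1] * len(pts)
    next_label = 0
    for i in range(len(pts)):
        if labels[i] != -1:
            continue
        labels[i] = next_label
        for j in range(i + 1, len(pts)):
            if labels[j] == -1 and math.dist(pts[i], pts[j]) < merge_tol:
                labels[j] = next_label
        next_label += 1
    return labels


# Two tight groups plus one isolated point, which keeps its own label.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
        (10.0, 10.0)]
print(sync_cluster(data, eps=0.5))  # [0, 0, 0, 1, 1, 1, 2]
```

In this toy run the isolated point has no neighbor other than itself, so it never moves and ends up as a singleton, which is one simple way to read "isolating noise". The neighbor search here is quadratic in the number of points; running the same dynamics on a sample rather than the full data set is where a sampling step such as the one the abstract mentions would reduce cost.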