BIRCH: an efficient data clustering method for very large databases
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
CURE: an efficient clustering algorithm for large databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Automatic subspace clustering of high dimensional data for data mining applications
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Clustering techniques for large data sets—from the past to the future
KDD '99 Tutorial notes of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Scalability for clustering algorithms revisited
ACM SIGKDD Explorations Newsletter
Self-Organization of Pulse-Coupled Oscillators with Application to Clustering
IEEE Transactions on Pattern Analysis and Machine Intelligence
Stochastic Complexity in Statistical Inquiry Theory
Stochastic Complexity in Statistical Inquiry Theory
A Distribution-Based Clustering Algorithm for Mining in Large Spatial Databases
ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
A Synchronization Based Algorithm for Discovering Ellipsoidal Clusters in Large Datasets
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
WaveCluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Efficient and Effective Clustering Methods for Spatial Data Mining
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
STING: A Statistical Information Grid Approach to Spatial Data Mining
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
IEEE Transactions on Fuzzy Systems
A new data clustering approach: Generalized cellular automata
Information Systems
Hi-index | 0.00 |
We propose a new clustering algorithm, called SyMP, which is based on synchronization of pulse-coupled oscillators. SyMP represents each data point by an Integrate-and-Fire oscillator and uses the relative similarity between the points to model the interaction between the oscillators. SyMP is robust to noise and outliers, determines the number of clusters in an unsupervised manner, identifies clusters of arbitrary shapes, and can handle very large data sets. The robustness of SyMP is an intrinsic property of the synchronization mechanism. To determine the optimum number of clusters, SyMP uses a dynamic resolution parameter. To identify clusters of various shapes, SyMP models each cluster by multiple Gaussian components. The number of components is automatically determined using a dynamic intra-cluster resolution parameter. Clusters with simple shapes would be modeled by few components while clusters with more complex shapes would require a larger number of components. The scalable version of SyMP uses an efficient incremental approach that requires a simple pass through the data set. The proposed clustering approach is empirically evaluated with several synthetic and real data sets, and its performance is compared with CURE.