CoFD: An Algorithm for Non-distance Based Clustering in High Dimensional Spaces
DaWaK 2000 Proceedings of the 4th International Conference on Data Warehousing and Knowledge Discovery
A new distributed data mining model based on similarity
Proceedings of the 2003 ACM symposium on Applied computing
The clustering problem has been widely studied because it arises in many application domains in engineering, business, and social science. Clustering aims to identify the distribution of patterns and the intrinsic correlations in large data sets by partitioning the data points into clusters of similar points. Traditional clustering algorithms use distance functions to measure similarity and are not suitable for high dimensional spaces. In this paper, we propose a non-distance based clustering algorithm for high dimensional spaces. Based on the maximum likelihood principle, the algorithm optimizes its parameters to maximize the likelihood between the data points and the model generated by those parameters. Experimental results on both synthetic data sets and a real data set show the efficiency and effectiveness of the algorithm.
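To make the likelihood-based idea concrete, the following is a minimal sketch of clustering without a distance function. It is not the CoFD algorithm, whose details the abstract does not give. It assumes binary feature vectors and an independent Bernoulli model per cluster, and it alternates between re-estimating the model parameters and reassigning each point to the cluster under which it is most likely. The names `log_likelihood`, `estimate_theta`, `fit`, and the smoothing constant `eps` are all hypothetical choices made for this illustration.

```python
import math


def log_likelihood(point, theta):
    """Log-probability of a binary vector under independent Bernoulli features."""
    return sum(math.log(t if x else 1.0 - t) for x, t in zip(point, theta))


def estimate_theta(members, dim, eps=0.5):
    """Smoothed estimate of each feature's 'on' probability within one cluster.

    The smoothing (an assumption of this sketch) keeps every probability
    strictly between 0 and 1, so the logarithm above is always defined,
    even for an empty cluster.
    """
    n = len(members)
    return [(sum(p[j] for p in members) + eps) / (n + 2 * eps) for j in range(dim)]


def fit(points, k, iters=20):
    """Alternate parameter estimation and most-likely-cluster assignment."""
    dim = len(points[0])
    labels = [i % k for i in range(len(points))]  # arbitrary deterministic start
    history = []  # total log-likelihood after each reassignment
    for _ in range(iters):
        thetas = [
            estimate_theta([p for p, l in zip(points, labels) if l == c], dim)
            for c in range(k)
        ]
        new_labels = [
            max(range(k), key=lambda c: log_likelihood(p, thetas[c]))
            for p in points
        ]
        history.append(
            sum(log_likelihood(p, thetas[l]) for p, l in zip(points, new_labels))
        )
        if new_labels == labels:  # assignments are stable, so stop
            break
        labels = new_labels
    return labels, history


# Toy data: the first three points use features 0-2, the last three use 3-5.
points = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 1],
]
labels, history = fit(points, k=2)
print(labels)
print(history)
```

No pairwise distance is computed anywhere in this sketch: each point is compared only with the per-cluster parameter vectors, which is the general sense in which a likelihood-based method can avoid distance functions.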