In this paper, we present a new approach to clustering a data set for which the only available information is a similarity measure between every pair of elements. The objective is to partition the set into disjoint subsets such that two elements assigned to the same subset are more likely to have a high similarity measure than two elements assigned to different subsets. The algorithm makes no assumptions about the size or number of clusters and imposes no constraints on the similarity measure. It relies on very simple operations: its running time is dominated by matrix multiplication and, in some cases, curve fitting. We present experimental results from various implementations of this method.
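To make the stated objective concrete, the following minimal sketch compares the average similarity of pairs placed in the same subset with that of pairs placed in different subsets. It illustrates only the input (a pairwise similarity matrix) and the partitioning goal; it is not the clustering algorithm of the paper, which the abstract does not specify. The function name `within_between_similarity` and the toy matrix are assumptions made up for illustration.

```python
from itertools import combinations


def within_between_similarity(S, labels):
    """Return (mean within-subset, mean between-subset) similarity.

    S      : n x n symmetric matrix (list of lists) of pairwise similarities
    labels : length-n list of subset ids describing a partition
    """
    within, between = [], []
    # Visit each unordered pair once; self-similarities are never included.
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            within.append(S[i][j])
        else:
            between.append(S[i][j])

    def mean(values):
        # An empty group (e.g. a single subset) has no defined mean.
        return sum(values) / len(values) if values else float("nan")

    return mean(within), mean(between)


# Hypothetical toy data: elements 0-2 resemble each other, as do 3-4.
S = [
    [1.0, 0.9, 0.8, 0.1, 0.2],
    [0.9, 1.0, 0.7, 0.2, 0.1],
    [0.8, 0.7, 1.0, 0.1, 0.1],
    [0.1, 0.2, 0.1, 1.0, 0.9],
    [0.2, 0.1, 0.1, 0.9, 1.0],
]

good = within_between_similarity(S, [0, 0, 0, 1, 1])  # matches the groups
bad = within_between_similarity(S, [0, 1, 0, 1, 0])   # cuts across them
```

Under this assumed setup, a partition that meets the objective shows a higher within-subset mean than between-subset mean (as `good` does), while a partition that cuts across the natural groups shows the reverse (as `bad` does).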