We propose an original approach to clustering multi-component data sets that also estimates the number of clusters. Using Prim's algorithm to construct a minimal spanning tree (MST), we show that, under the assumption that the vertices are approximately distributed according to a spatially homogeneous Poisson process, the number of clusters can be accurately estimated by thresholding the sequence of edge lengths added to the MST by Prim's algorithm. This sequence, called the Prim trajectory, contains sufficient information to determine both the number of clusters and the approximate locations of the cluster centroids. The estimated number of clusters and the estimated centroids are then used to initialize the generalized Lloyd algorithm, also known as k-means, thereby circumventing its well-known initialization problems. We evaluate the false positive rate of our cluster detection algorithm using Poisson approximations in Euclidean spaces. Applications of this method in the multi/hyper-spectral imagery domain, to a satellite view of Paris and to an image of Mars, are also presented.
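The idea of thresholding the Prim trajectory can be illustrated with a minimal sketch. It is not the procedure from the paper: the function names `prim_trajectory` and `cluster_centroids`, the user-supplied `threshold`, and the `min_size` filter are all assumptions made for illustration, whereas the paper derives its threshold from the Poisson model. The sketch records the edge lengths in the order Prim's algorithm adds them, cuts the trajectory wherever an edge exceeds the threshold, and averages the vertices in each remaining run to obtain candidate centroids.

```python
import numpy as np

def prim_trajectory(points, start=0):
    """Run Prim's algorithm on the complete Euclidean graph.

    Returns the vertices in the order they join the tree and the length
    of the edge that attached each vertex after the first.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[start] = True
    # best[j]: distance from vertex j to its nearest vertex already in the tree
    best = np.linalg.norm(pts - pts[start], axis=1)
    best[start] = np.inf
    order, lengths = [start], []
    for _ in range(n - 1):
        j = int(np.argmin(best))          # closest vertex outside the tree
        lengths.append(float(best[j]))
        order.append(j)
        in_tree[j] = True
        best = np.minimum(best, np.linalg.norm(pts - pts[j], axis=1))
        best[in_tree] = np.inf            # never pick a tree vertex again
    return order, lengths

def cluster_centroids(points, order, lengths, threshold, min_size=2):
    """Cut the trajectory at edges longer than `threshold` (an assumed,
    user-chosen value) and return the mean of each run of vertices that
    has at least `min_size` members (also an assumption)."""
    pts = np.asarray(points, dtype=float)
    runs, current = [], [order[0]]
    for vertex, length in zip(order[1:], lengths):
        if length > threshold:            # long edge: the tree left a cluster
            runs.append(current)
            current = []
        current.append(vertex)
    runs.append(current)
    return [pts[r].mean(axis=0) for r in runs if len(r) >= min_size]

# Two well-separated groups of four points each (made-up illustrative data).
points = [[0, 0], [0, 1], [1, 0], [1, 1],
          [10, 10], [10, 11], [11, 10], [11, 11]]
order, lengths = prim_trajectory(points)
centroids = cluster_centroids(points, order, lengths, threshold=3.0)
print(len(centroids), centroids)
```

The number of returned centroids plays the role of the estimated number of clusters, and the centroids themselves could serve as the initial codebook for a k-means run, for instance through the array-valued `init` argument of scikit-learn's `KMeans`.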