Scatter/Gather: a cluster-based approach to browsing large document collections
SIGIR '92 Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval
OPTICS: ordering points to identify the clustering structure
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Modeling the manifolds of images of handwritten digits
IEEE Transactions on Neural Networks
Assessment and pruning of hierarchical model based clustering
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
IEEE Transactions on Knowledge and Data Engineering
Adherence clustering: an efficient method for mining market-basket clusters
Information Systems
Mean shift based clustering of Hough domain for fast line segment detection
Pattern Recognition Letters
MESO: Supporting Online Decision Making in Autonomic Computing Systems
IEEE Transactions on Knowledge and Data Engineering
An efficient clustering approach for large document collections
ADMA'05 Proceedings of the First international conference on Advanced Data Mining and Applications
A Graph Analytical Approach for Topic Detection
ACM Transactions on Internet Technology (TOIT)
The goal of clustering is to identify distinct groups in a dataset. Compared to non-parametric clustering methods like complete linkage, hierarchical model-based clustering has the advantage of offering a way to estimate the number of groups present in the data. However, its computational cost is quadratic in the number of items to be clustered, and it is therefore not applicable to large problems. We review an idea called Fractionation, originally conceived by Cutting, Karger, Pedersen and Tukey for non-parametric hierarchical clustering of large datasets, and describe an adaptation of Fractionation to model-based clustering. A further extension, called Refractionation, leads to a procedure that can be successful even in the difficult situation where there are large numbers of small groups.
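
The Fractionation idea mentioned above can be illustrated with a minimal sketch. The abstract does not spell out the procedure, so the outline here is an assumption based on the general scheme: split the data into fractions small enough for a quadratic-cost method, reduce each fraction to a few clusters, treat those clusters as meta-observations, and repeat until one final clustering pass is affordable. Plain centroid-linkage agglomeration stands in for the model-based step, and the names `fractionate`, `agglomerate`, `fraction_size` and `reduction` are hypothetical. Refractionation is not covered.

```python
import math
import random


def merge(a, b):
    """Merge two (centroid, weight) meta-observations into one."""
    (ca, wa), (cb, wb) = a, b
    w = wa + wb
    return tuple((x * wa + y * wb) / w for x, y in zip(ca, cb)), w


def agglomerate(items, k):
    """Naive centroid-linkage agglomeration down to k clusters.

    Every merge scans all pairs, so the cost grows at least quadratically
    with len(items); it is only ever applied to one fraction at a time.
    This is a stand-in for the model-based clustering step (assumption).
    """
    items = list(items)
    while len(items) > k:
        i, j = min(
            ((i, j) for i in range(len(items)) for j in range(i + 1, len(items))),
            key=lambda p: math.dist(items[p[0]][0], items[p[1]][0]),
        )
        merged = merge(items[i], items[j])
        items = [it for n, it in enumerate(items) if n not in (i, j)] + [merged]
    return items


def fractionate(points, k, fraction_size=50, reduction=0.2):
    """Cluster points into k groups while never clustering more than
    fraction_size items at once (hypothetical parameter names)."""
    assert 0 < reduction < 1 and k <= fraction_size
    items = [(tuple(p), 1) for p in points]  # each point starts with weight 1
    while len(items) > fraction_size:
        reduced = []
        for start in range(0, len(items), fraction_size):
            fraction = items[start:start + fraction_size]
            target = max(1, math.ceil(len(fraction) * reduction))
            reduced.extend(agglomerate(fraction, target))
        items = reduced  # clusters become the next round's meta-observations
    return agglomerate(items, k)


# Usage: three well-separated synthetic groups of 200 points each.
random.seed(0)
centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
points = [
    (random.gauss(cx, 0.5), random.gauss(cy, 0.5))
    for cx, cy in centers
    for _ in range(200)
]
random.shuffle(points)
clusters = fractionate(points, k=3)
```

Each element of `clusters` is a `(centroid, weight)` pair, where the weight counts the original points absorbed into that cluster. Because no single call to `agglomerate` sees more than `fraction_size` items, the total work grows roughly linearly in the number of points for a fixed fraction size, which is the property that makes the approach attractive for large datasets.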