We propose a simple and intuitive clustering evaluation criterion based on the minimum description length (MDL) principle, which yields a particularly simple way of describing and encoding a set of examples. The basic idea is to view a clustering as a restriction of the attribute domains, given an example's cluster membership. As a special operational case we develop the so-called rectangular uniform message length measure, which can be used to evaluate clusterings described as sets of hyper-rectangles. We prove theoretically that this measure penalizes cluster boundaries in regions of uniform instance distribution (i.e., unintuitive clusterings), and we experimentally compare a simple clustering algorithm based on this measure with the well-known algorithms KMeans and AutoClass.
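To illustrate the flavor of such a measure, the following sketch scores a clustering by encoding each point uniformly within its cluster's axis-aligned bounding hyper-rectangle, plus the cost of transmitting cluster memberships. This is an assumption-laden simplification for intuition only: the function name `rectangular_uniform_mdl`, the fixed coding `precision`, and the omission of any model cost for the rectangle bounds are choices of this sketch, not the measure defined in the paper.

```python
import math

def rectangular_uniform_mdl(clusters, precision=0.01):
    """Illustrative MDL-style message length for a clustering.

    clusters: list of clusters, each a non-empty list of points
    (tuples of floats, all with the same dimensionality).
    Each point is coded uniformly within its cluster's bounding
    hyper-rectangle, quantized at `precision` per dimension.
    NOTE: a simplified sketch, not the paper's exact measure.
    """
    n = sum(len(pts) for pts in clusters)
    k = len(clusters)
    # Cost of transmitting each example's cluster membership
    # (uniform code over k clusters).
    total = n * math.log2(k) if k > 1 else 0.0
    for pts in clusters:
        dims = len(pts[0])
        for d in range(dims):
            vals = [p[d] for p in pts]
            side = max(vals) - min(vals)
            # Coordinates are coded uniformly over the restricted
            # domain; a degenerate side still costs one quantum.
            cells = max(side / precision, 1.0)
            total += len(pts) * math.log2(cells)
    return total
```

Restricting the attribute domains pays off exactly when clusters are compact: two tight, well-separated clusters yield shorter per-point codes than one wide rectangle spanning both, even after paying the membership bits.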