Clustering is widely used to partition data into groups so that the degree of association is high among members of the same group and low among members of different groups. Although many effective and efficient clustering algorithms have been developed and deployed, most of them still lack an automatic or online way to decide the optimal number of clusters. In this paper, we define clustering gain as a measure of clustering optimality; it is computed from the squared error sum as a clustering algorithm proceeds. When the measure is applied to a hierarchical clustering algorithm, an optimal number of clusters can be found. Experimental results show that the measure performs well, producing intuitively reasonable clustering configurations in Euclidean space. Furthermore, the measure can be used to estimate the desired number of clusters for partitional clustering methods. The clustering gain measure is therefore a promising technique for improving the quality of a wide range of clustering methods.
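The abstract does not give the formula for clustering gain, so the following is only a minimal sketch under stated assumptions. It assumes the gain of a configuration is the sum over clusters of (n_j - 1) * ||c_j - c_0||^2, where c_j is the centroid of cluster j, n_j its size, and c_0 the global centroid; this quantity is zero both when every point is its own cluster and when all points form one cluster. It also assumes a naive centroid-linkage agglomeration. The function names (`clustering_gain`, `gain_per_level`, `best_k`) are hypothetical and chosen for illustration.

```python
import numpy as np


def clustering_gain(points, labels):
    """Assumed gain: sum over clusters of (n_j - 1) * ||c_j - c_0||^2."""
    c0 = points.mean(axis=0)  # global centroid
    gain = 0.0
    for lab in np.unique(labels):
        members = points[labels == lab]
        cj = members.mean(axis=0)  # centroid of this cluster
        gain += (len(members) - 1) * float(np.sum((cj - c0) ** 2))
    return gain


def gain_per_level(points):
    """Merge clusters bottom-up and record the assumed gain for each cluster count.

    Uses naive centroid linkage (an assumption, not taken from the abstract):
    at each step the two clusters with the closest centroids are merged.
    Returns a dict mapping number of clusters -> gain.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    labels = np.arange(n)  # start with every point in its own cluster
    gains = {n: clustering_gain(points, labels)}
    while True:
        ids = np.unique(labels)
        if len(ids) == 1:
            break
        cents = np.array([points[labels == i].mean(axis=0) for i in ids])
        # Pairwise squared distances between cluster centroids.
        d = np.sum((cents[:, None, :] - cents[None, :, :]) ** 2, axis=-1)
        np.fill_diagonal(d, np.inf)  # never merge a cluster with itself
        a, b = np.unravel_index(np.argmin(d), d.shape)
        labels[labels == ids[b]] = ids[a]  # merge cluster b into cluster a
        gains[len(ids) - 1] = clustering_gain(points, labels)
    return gains


def best_k(gains):
    """Return the cluster count with the largest recorded gain."""
    return max(gains, key=gains.get)


# Usage: two compact, well-separated groups of four points each.
demo_points = [(0, 0), (0, 1), (1, 0), (1, 1),
               (10, 10), (10, 11), (11, 10), (11, 11)]
demo_gains = gain_per_level(demo_points)
print(best_k(demo_gains))
```

The sketch runs in roughly cubic time and is meant only to show how a gain curve over the merge sequence can select a cluster count; a practical version would use an optimized hierarchical clustering routine.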