Clustering is one of the best-known types of unsupervised learning. Evaluating the quality of clustering results and determining the number of clusters in the data are important issues. Most current validity indices cover only a subset of the important aspects of clusters. Moreover, these indices are relevant only for data sets containing at least two clusters. In this paper, a new bounded index for cluster validity, called the score function (SF), is introduced. The score function is based on standard cluster properties. Several artificial and real-life data sets are used to evaluate its performance, and it is tested against four existing validity indices. In the case of hyperspheroidal clusters, the proposed index is found to be always as good as or better than these indices. It is shown to work well on multi-dimensional data sets and is able to accommodate unique and sub-cluster cases.