Clustering Validity Assessment: Finding the Optimal Partitioning of a Data Set

Authors:
Maria Halkidi;Michalis Vazirgiannis
Venue:
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
Year:
2001

Citing 0
Cited 42

Clustering validity checking methods: part II

ACM SIGMOD Record
Knowledge Accumulation and Resolution of Data Inconsistencies during the Integration of Microbial Information Sources

IEEE Transactions on Knowledge and Data Engineering
A Shrinking-Based Clustering Approach for Multidimensional Data

IEEE Transactions on Knowledge and Data Engineering
A Framework for Semi-Supervised Learning Based on Subjective and Objective Clustering Criteria

ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
New indices for cluster validity assessment

Pattern Recognition Letters
A shrinking-based approach for multi-dimensional data analysis

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
A clustering framework based on subjective and objective validity criteria

ACM Transactions on Knowledge Discovery from Data (TKDD)
A density-based cluster validity approach using multi-representatives

Pattern Recognition Letters
Network anomaly detection based on semi-supervised clustering

SMO'07 Proceedings of the 7th WSEAS International Conference on Simulation, Modelling and Optimization
Cluster validity measurement for arbitrary shaped clusters

AIKED'06 Proceedings of the 5th WSEAS International Conference on Artificial Intelligence, Knowledge Engineering and Data Bases
Cluster validity measurement techniques

AIKED'06 Proceedings of the 5th WSEAS International Conference on Artificial Intelligence, Knowledge Engineering and Data Bases
Aggregation pheromone density based data clustering

Information Sciences: an International Journal
An overview of clustering methods

Intelligent Data Analysis
Clustering Quality and Topology Preservation in Fast Learning SOMs

ICANN '08 Proceedings of the 18th international conference on Artificial Neural Networks, Part I
Image recognition system based on novel measures of image similarity and cluster validity

Neurocomputing
Use of aggregation pheromone density for image segmentation

Pattern Recognition Letters
Cluster validation using information stability measures

Pattern Recognition Letters
A quantum-inspired genetic algorithm for k-means clustering

Expert Systems with Applications: An International Journal
A clustering validity assessment index

PAKDD'03 Proceedings of the 7th Pacific-Asia conference on Advances in knowledge discovery and data mining
Fractional particle swarm optimization in multidimensional search space

IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Flocking based approach for data clustering

Natural Computing: an international journal
Long distance bigram models applied to word clustering

Pattern Recognition
An efficient preprocessing stage for the relationship-based clustering framework

Intelligent Data Analysis
Automatic and incremental generation of membership functions

ICAISC'10 Proceedings of the 10th international conference on Artificial intelligence and soft computing: Part I
Best clustering configuration metrics: towards multiagent based clustering

ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications: Part I
Band correction in random amplified polymorphism DNA images using hybrid genetic algorithms with multilevel thresholding

IWINAC'11 Proceedings of the 4th international conference on Interplay between natural and artificial computation: new challenges on bioinspired applications - Volume Part II
Efficient selectivity estimation by histogram construction based on subspace clustering

SSDBM'11 Proceedings of the 23rd international conference on Scientific and statistical database management
From alternative clustering to robust clustering and its application to gene expression data

IDEAL'11 Proceedings of the 12th international conference on Intelligent data engineering and automated learning
Supervised and unsupervised landuse map generation from remotely sensed images using ant based systems

Applied Soft Computing
A classification of cluster validity indexes based on membership degree and applications

WISM'11 Proceedings of the 2011 international conference on Web information systems and mining - Volume Part I
Model order selection for multiple cooperative swarms clustering using stability analysis

Information Sciences: an International Journal
Efficient content-based image retrieval using Multiple Support Vector Machines Ensemble

Expert Systems with Applications: An International Journal
Multi-objective genetic algorithm based clustering approach and its application to gene expression data

ADVIS'04 Proceedings of the Third international conference on Advances in Information Systems
Clustering analysis of water quality for canals in Bangkok, Thailand

ICCSA'10 Proceedings of the 2010 international conference on Computational Science and Its Applications - Volume Part III
Aggregation pheromone density based image segmentation

ICVGIP'06 Proceedings of the 5th Indian conference on Computer Vision, Graphics and Image Processing
Estimation of the number of clusters using multiple clustering validity indices

MCS'10 Proceedings of the 9th international conference on Multiple Classifier Systems
Integrating wavelets with clustering and indexing for effective content-based image retrieval

Knowledge-Based Systems
An extensive comparative study of cluster validity indices

Pattern Recognition
ASCCN: Arbitrary Shaped Clustering Method with Compatible Nucleoids

International Journal of Data Warehousing and Mining
Classifying Immunophenotypes With Templates From Flow Cytometry

Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics
Usage Profiles: A Process for Discovering Usage Patterns over Web Services and its Application to Service Evolution

International Journal of Web Services Research
Learning a taxonomy of predefined and discovered activity patterns

Journal of Ambient Intelligence and Smart Environments

Abstract

Clustering is a mostly unsupervised procedure, and the majority of clustering algorithms depend on certain assumptions in order to define the subgroups present in a data set. As a consequence, in most applications the resulting clustering scheme requires some sort of evaluation as regards its validity. In this paper we present a clustering validity procedure, which evaluates the results of clustering algorithms on data sets. We define a validity index, S_Dbw, based on well-defined clustering criteria, enabling the selection of the optimal input parameter values for a clustering algorithm that result in the best partitioning of a data set. We evaluate the reliability of our index both theoretically and experimentally, considering three representative clustering algorithms run on synthetic and real data sets. Also, we carried out an evaluation study to compare S_Dbw performance with other known validity indices. Our approach performed favorably in all cases, even in those where other indices failed to indicate the correct partitions in a data set.
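
The abstract does not state the formula for S_Dbw, so the following is only a minimal sketch of how such an index might be used to choose between candidate partitionings. It assumes the commonly cited formulation S_Dbw = Scat + Dens_bw (average cluster scattering plus inter-cluster density), where lower values indicate a better partitioning. The function names (`s_dbw`, `_variance_norm`, `_density`), the toy data, and the rule of skipping a cluster pair whose centre densities are both zero are assumptions made for illustration, not details taken from this record.

```python
import math

def _mean(points):
    """Per-dimension mean of a list of points."""
    return [sum(col) / len(points) for col in zip(*points)]

def _variance_norm(points, center):
    """Euclidean norm of the per-dimension variance vector around center."""
    var = [sum((p[d] - center[d]) ** 2 for p in points) / len(points)
           for d in range(len(center))]
    return math.sqrt(sum(v * v for v in var))

def _density(points, u, radius):
    """Number of points lying within radius of location u."""
    return sum(1 for p in points if math.dist(p, u) <= radius)

def s_dbw(points, labels):
    """Assumed formulation: Scat + Dens_bw; lower is better."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    groups = list(clusters.values())
    c = len(groups)
    if c < 2:
        raise ValueError("need at least two clusters")
    centers = [_mean(g) for g in groups]
    norms = [_variance_norm(g, v) for g, v in zip(groups, centers)]

    # Scat: average cluster variance relative to the whole data set.
    scat = sum(norms) / (c * _variance_norm(points, _mean(points)))

    # Neighbourhood radius: average standard deviation of the clusters.
    stdev = math.sqrt(sum(norms)) / c

    # Dens_bw: density at the midpoint between two cluster centres,
    # relative to the larger density found at the two centres.
    total = 0.0
    for i in range(c):
        for j in range(c):
            if i == j:
                continue
            union = groups[i] + groups[j]
            mid = [(a + b) / 2 for a, b in zip(centers[i], centers[j])]
            denom = max(_density(union, centers[i], stdev),
                        _density(union, centers[j], stdev))
            if denom > 0:  # assumption: empty neighbourhoods contribute 0
                total += _density(union, mid, stdev) / denom
    return scat + total / (c * (c - 1))

# Toy data: two well-separated square groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
candidates = {
    "good": [0, 0, 0, 0, 1, 1, 1, 1],  # matches the visible groups
    "bad": [0, 1, 0, 1, 0, 1, 0, 1],   # mixes the two groups
}
# Select the partitioning with the lowest index value.
best = min(candidates, key=lambda name: s_dbw(points, candidates[name]))
```

In practice the candidate labelings would come from running a clustering algorithm with different input parameter values (for example, different numbers of clusters), and the parameter setting whose result minimises the index would be kept.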