A new separation measure for improving the effectiveness of validity indices

Authors:
Shihong Yue;Jeen-Shing Wang;Teresa Wu;Huaxiang Wang
Affiliations:
School of Electrical Engineering and Automation, Tianjin Key Laboratory of Process Measurement and Control, Tianjin University, Tianjin 300072, China;Department of Electrical Engineering, National Cheng Kung University, Tainan 701, Taiwan;School of Computing, Informatics, Decision Systems Engineering, Arizona State University, Tempe, AZ 85287, USA;School of Electrical Engineering and Automation, Tianjin Key Laboratory of Process Measurement and Control, Tianjin University, Tianjin 300072, China
Venue:
Information Sciences: an International Journal
Year:
2010

Abstract

Many validity indices have been proposed for quantitatively assessing the performance of clustering algorithms. One limitation of existing indices is their lack of generalizability, which stems from their dependence on specific algorithms and on the structure of the data space. To handle large-scale datasets with arbitrary structures, this study proposes a new cluster separation measure for improving the effectiveness of existing validity indices. The original data space is partitioned into a grid-based structure, which allows separation to be measured from the actual data distribution between any two clusters rather than from the distance between the two cluster prototypes. To validate the proposed separation measure, we integrate it into two commonly used validity indices, the Davies-Bouldin index (DB) and Tibshirani's Gap statistic (GS). The revised indices are denoted R-DB-1 and R-GS-1 for clusters with sphere-shaped structures and R-DB-2 and R-GS-2 for irregularly shaped structures. This integration enables the indices to evaluate both partitional and hierarchical algorithms. Partitional algorithms, including C-Means (CM) and Fuzzy C-Means (FCM), and hierarchical algorithms, including DBSCAN and CLIQUE, are used to test the performance of the new indices. The indices are first compared on two synthetic datasets with spherical structures and four synthetic datasets with irregular shapes. Five real datasets from the UCI machine learning repository are then used to further test the measure's performance. The experimental results provide evidence that the new indices outperform the original indices.
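
For readers unfamiliar with the baseline, the following is a minimal sketch of the classical Davies-Bouldin index, the prototype-distance form that the abstract contrasts with the proposed measure. It is not the authors' R-DB-1 or R-DB-2: the grid-based separation measure is not specified in this record and is not reproduced here. The function names `centroid` and `davies_bouldin`, and the representation of clusters as lists of coordinate tuples, are assumptions made for illustration.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def davies_bouldin(clusters):
    """Classical Davies-Bouldin index (lower means better separated).

    clusters: list of clusters, each a non-empty list of coordinate tuples.
    Assumes no two cluster centroids coincide (otherwise division by zero).
    """
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    centers = [centroid(c) for c in clusters]
    # Scatter S_i: mean distance from each point to its cluster centroid.
    scatter = [
        sum(math.dist(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centers)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Separation M_ij is the centroid-to-centroid (prototype) distance.
        # This is the term a distribution-aware measure would replace.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Two clusters with unit scatter whose centroids are 4 apart: (1 + 1) / 4.
example = [[(0.0, 0.0), (0.0, 2.0)], [(4.0, 0.0), (4.0, 2.0)]]
print(davies_bouldin(example))  # 0.5
```

Because the separation term depends only on the two centroids, this form can misjudge elongated or irregular clusters whose centroids are far apart while their boundaries nearly touch, which is the weakness the abstract attributes to prototype-based separation.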