Applications of an enhanced cluster validity index method based on the Fuzzy C-means and rough set theories to partition and classification

Authors:
Kuang Yu Huang
Affiliations:
Department of Information Management, Ling Tung University, #1 Ling Tung Road, Taichung City 408, Taiwan
Venue:
Expert Systems with Applications: An International Journal
Year:
2010

Abstract

This study proposes a cluster validity index that simultaneously measures the goodness of clustering of partitioned data and the classification accuracy for complicated information systems, based on the PBMF-index method and rough set (RS) theory. The maximum value of this index, called the Huang-index, not only identifies the best partitioning but also yields the optimal classification accuracy for the approximation sets. The traditional PBMF-index method is used only to ensure the formation of a small number of compact clusters with large separation between at least two clusters. In contrast, the Huang-index method extends unsupervised optimal clustering to the field of classification. In the proposed algorithm, all the attributes of the data are first clustered into groups using the Fuzzy C-means (FCM) method. The clustered data are then used, on the basis of RS theory, to identify the approximation regions and the classification accuracy and to calculate the cluster centroids for the decision attribute. Finally, these calculated quantities are substituted into the proposed index function to obtain the cluster validity index. The validity of the proposed approach is demonstrated using data derived from a hypothetical function of two independent variables and electronic stock data extracted from the financial database maintained by the Taiwan Economic Journal (TEJ). The clustering results obtained using the proposed method are compared with those obtained using the traditional PBMF-index partition method. The effects of the number of clusters on the cluster partitions and the RS regions are systematically examined and compared. The results show that the proposed Huang-index method not only provides a clustering capability superior to that of the traditional clustering algorithm, but also yields a reliable classification and a set of suitable decision rules extracted by means of RS theory.
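
The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not state the Huang-index formula, so the sketch does not attempt to reproduce it. It shows only (a) a plain FCM loop, (b) the PBMF index in its commonly cited form, (1/c · E_1/J_m · D_c)^2, and (c) the standard rough set accuracy of approximation, |lower| / |upper|, with clusters treated as equivalence classes. The function names, the default fuzzifier m = 2, and the use of hard cluster labels as equivalence classes are all assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain FCM (illustrative). Returns (centers, U), U has shape (c, n)."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0)                      # memberships of each point sum to 1
    for _ in range(n_iter):
        Um = U ** m
        centers = Um @ X / Um.sum(axis=1, keepdims=True)
        d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2)
        d = np.fmax(d, 1e-12)               # guard against division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def pbmf_index(X, centers, U, m=2.0):
    """PBMF index in its commonly cited form (assumed, not taken from the paper)."""
    c = centers.shape[0]
    d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2)
    J_m = ((U ** m) * d).sum()              # fuzzy within-cluster scatter
    E_1 = np.linalg.norm(X - X.mean(axis=0), axis=1).sum()  # scatter for c = 1
    D_c = max(np.linalg.norm(a - b) for a in centers for b in centers)
    return ((1.0 / c) * (E_1 / J_m) * D_c) ** 2

def approximation_accuracy(cluster_labels, decision_labels, target):
    """Rough set accuracy |lower| / |upper| for one decision class.

    Each cluster is treated as one equivalence class (an assumption).
    """
    cluster_labels = np.asarray(cluster_labels)
    decision_labels = np.asarray(decision_labels)
    lower = upper = 0
    for k in np.unique(cluster_labels):
        block = decision_labels[cluster_labels == k]
        hits = np.count_nonzero(block == target)
        if hits == block.size:              # cluster lies entirely inside the class
            lower += block.size
        if hits > 0:                        # cluster overlaps the class
            upper += block.size
    return lower / upper if upper else 0.0
```

In use, one would run `fuzzy_c_means` for a range of cluster counts, evaluate `pbmf_index` and `approximation_accuracy` for each, and inspect how both quantities change with the number of clusters. How the paper combines them into a single index is not specified in the abstract and is left open here.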