Improvement of new automatic differential fuzzy clustering using SVM classifier for microarray analysis

Authors:
Indrajit Saha;Ujjwal Maulik;Sanghamitra Bandyopadhyay;Dariusz Plewczynski
Affiliations:
Interdisciplinary Centre for Mathematical and Computational Modeling (ICM), University of Warsaw, 02-106 Warsaw, Poland;Department of Computer Science and Engineering, Jadavpur University, Kolkata 700 032, West Bengal, India;The Machine Intelligence Unit, Indian Statistical Institute, Kolkata 700 108, West Bengal, India;Interdisciplinary Centre for Mathematical and Computational Modeling (ICM), University of Warsaw, 02-106 Warsaw, Poland
Venue:
Expert Systems with Applications: An International Journal
Year:
2011

Abstract

In recent years, the problem of clustering microarray data has gained significant attention. However, most clustering methods attempt to find groups of genes under the assumption that the number of clusters is known a priori. This motivated us to develop a new real-coded, improved differential evolution based automatic fuzzy clustering algorithm that automatically evolves the number of clusters as well as the proper partitioning of a gene expression data set. To improve the result further, the clustering method is integrated with a support vector machine (SVM), a well-known technique for supervised learning. A fraction of the gene expression data points, selected from different clusters based on their proximity to the respective centers, is used for training the SVM. The cluster assignments of the remaining gene expression data points are thereafter determined using the trained classifier. The performance of the proposed clustering technique is demonstrated on five gene expression data sets by comparing it with differential evolution based automatic fuzzy clustering, variable length genetic algorithm based fuzzy clustering, and the well-known Fuzzy C-Means algorithm. A statistical significance test has been carried out to establish the statistical superiority of the proposed clustering approach. A biological significance test has also been carried out using a web-based gene annotation tool to show that the proposed method is able to produce biologically relevant clusters of genes. The processed data sets and the MATLAB version of the software are available at http://bio.icm.edu.pl/~darman/IDEAFC-SVM/.
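
The step in which points near the cluster centers are set aside for training can be illustrated with a small sketch. This is not the authors' MATLAB implementation: the function names `fcm_memberships` and `split_by_proximity`, the toy data, the fuzzifier `m = 2.0`, and the training fraction of 0.5 are all assumptions made for illustration, and the abstract does not state the fraction or the exact proximity criterion used. The sketch computes standard fuzzy c-means memberships for fixed centers, labels each point by its highest membership, and keeps the closest fraction of each cluster as the training set. Training the SVM on those points and relabeling the rest is left out.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Standard fuzzy c-means membership of every point in every cluster."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # The point coincides with a center: share full membership
            # among the coinciding centers, zero elsewhere.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [r / total for r in row]
        else:
            row = [1.0 / sum((dk / dj) ** power for dj in dists) for dk in dists]
        memberships.append(row)
    return memberships


def split_by_proximity(points, centers, fraction=0.5, m=2.0):
    """Return (labels, train_idx, rest_idx).

    train_idx holds, for each cluster, the `fraction` of its members that
    lie closest to the cluster center; rest_idx holds the remaining points,
    whose labels a trained classifier would later reassign.
    """
    u = fcm_memberships(points, centers, m)
    labels = [max(range(len(centers)), key=lambda k: row[k]) for row in u]
    train_idx, rest_idx = [], []
    for k, center in enumerate(centers):
        members = [i for i, lab in enumerate(labels) if lab == k]
        members.sort(key=lambda i: math.dist(points[i], center))
        n_train = math.ceil(fraction * len(members))
        train_idx.extend(members[:n_train])
        rest_idx.extend(members[n_train:])
    return labels, sorted(train_idx), sorted(rest_idx)


# Toy two-dimensional data (hypothetical, not from the paper's data sets).
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (5.0, 5.0), (5.1, 5.0), (4.0, 5.0)]
centers = [(0.0, 0.0), (5.0, 5.0)]

labels, train_idx, rest_idx = split_by_proximity(points, centers, fraction=0.5)
print("labels:", labels)
print("training indices:", train_idx)
print("indices left for the classifier:", rest_idx)
```

In a full pipeline, the points at `train_idx` with their cluster labels would be passed to an SVM library of one's choice, and the trained model would predict labels for the points at `rest_idx`.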