Towards improving fuzzy clustering using support vector machine: Application to gene expression data

Authors:
Anirban Mukhopadhyay;Ujjwal Maulik
Affiliations:
Department of Computer Science and Engineering, University of Kalyani, Kalyani 741235, India;Department of Computer Science and Engineering, Jadavpur University, Kolkata 700032, India
Venue:
Pattern Recognition
Year:
2009


Abstract

Recent advances in microarray technology permit the expression levels of a large set of genes to be monitored across a number of time points simultaneously. Extracting knowledge from such a huge volume of microarray gene expression data requires computational analysis. Clustering is one of the important data mining tools for analyzing such microarray data, grouping similar genes into clusters. Researchers have proposed a number of clustering algorithms for this purpose. In this article, an attempt is made to improve the performance of fuzzy clustering by combining it with a support vector machine (SVM) classifier. A recently proposed real-coded variable string length genetic algorithm based clustering technique and an iterated version of fuzzy C-means clustering are used for this purpose. The performance of the proposed clustering scheme is compared with that of some well-known existing clustering algorithms and their SVM-boosted versions on one simulated and six real-life gene expression data sets. A statistical significance test based on analysis of variance (ANOVA), followed by an a posteriori Tukey-Kramer multiple comparison test, has been conducted to establish the statistical significance of the superior performance of the proposed clustering scheme. Moreover, the biological significance of the clustering solutions has been established.
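
The abstract does not spell out how fuzzy clustering and the SVM are combined. One plausible reading, assumed here and not confirmed by the text above, is that points with a high fuzzy membership act as labelled training data for an SVM, which then relabels the ambiguous points. The sketch below covers only the fuzzy side of that reading: a plain fuzzy C-means in pure Python and a split of the points by membership confidence. The function names, the farthest-point seeding, the fuzzifier `m = 2.0`, the threshold `0.9`, and the toy data are all illustrative assumptions, not values taken from the paper.

```python
import math


def memberships(points, centers, m):
    """Standard fuzzy C-means membership matrix; each row sums to 1."""
    power = 2.0 / (m - 1.0)
    u = []
    for p in points:
        # Clamp distances so a point sitting on a center does not divide by zero.
        d = [max(math.dist(p, c), 1e-12) for c in centers]
        u.append([1.0 / sum((dj / dl) ** power for dl in d) for dj in d])
    return u


def fuzzy_c_means(points, k, m=2.0, iters=100, tol=1e-9):
    """Plain fuzzy C-means; returns (centers, membership matrix)."""
    # Assumption: deterministic farthest-point seeding (not from the paper).
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(far))
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for j in range(k):
            # Center update: mean of the points weighted by u_ij ** m.
            w = [row[j] ** m for row in u]
            total = sum(w)
            new_centers.append(
                [sum(wi * p[d] for wi, p in zip(w, points)) / total
                 for d in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)


def split_by_confidence(u, threshold=0.9):
    """Indices whose largest membership reaches the threshold, and the rest."""
    confident, uncertain = [], []
    for i, row in enumerate(u):
        (confident if max(row) >= threshold else uncertain).append(i)
    return confident, uncertain


# Toy data: two tight groups plus one point roughly halfway between them.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5.3, 5.3)]
centers, u = fuzzy_c_means(points, k=2)
confident, uncertain = split_by_confidence(u)
print("confident:", confident, "uncertain:", uncertain)
```

Under the assumed reading, the `confident` indices, labelled by their largest membership, would form the training set for any SVM implementation, and the trained classifier would assign final cluster labels to the `uncertain` indices.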