Multi-objective genetic algorithm based clustering approach and its application to gene expression data

Authors:
Tansel Özyer;Yimin Liu;Reda Alhajj;Ken Barker
Affiliations:
Department of Computer Science, University of Calgary, Calgary, Alberta, Canada;Department of Computer Science, University of Calgary, Calgary, Alberta, Canada;Department of Computer Science, University of Calgary, Calgary, Alberta, Canada;Department of Computer Science, University of Calgary, Calgary, Alberta, Canada
Venue:
ADVIS'04 Proceedings of the Third international conference on Advances in Information Systems
Year:
2004

Abstract

Gene clustering is a common methodology for analyzing similar data based on expression trajectories. Clustering algorithms generally require the number of clusters a priori, and this number is usually hard to estimate, even for domain experts. In this paper, we use a Niched Pareto k-means Genetic Algorithm (GA) to cluster mRNA data. Running the multi-objective GA yields a Pareto-optimal front whose solution set offers alternatives for the optimal number of clusters. We analyze the clustering results with two cluster validity techniques commonly cited in the literature, namely the DB index and the SD index. This gives a ranking of the candidate numbers of clusters under each validity index. We tested the proposed clustering approach in experiments on three data sets, namely figure2data, cancer (NCI60) and Leukaemia data. The obtained results are promising; they demonstrate the applicability and effectiveness of the proposed approach.
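
To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes, for illustration only, that each candidate clustering is scored on two objectives to be minimized (number of clusters and a within-cluster variation value; the abstract does not name the objectives), and it uses the standard Davies-Bouldin (DB) definition with Euclidean distances. The names pareto_front and davies_bouldin are hypothetical.

```python
import numpy as np


def pareto_front(points):
    """Return indices of non-dominated points (all objectives minimized).

    Assumption: each point is (number_of_clusters, within_cluster_variation).
    """
    pts = np.asarray(points, dtype=float)
    front = []
    for i, p in enumerate(pts):
        # q dominates p if q is no worse everywhere and strictly better somewhere.
        dominated = any(
            np.all(q <= p) and np.any(q < p)
            for j, q in enumerate(pts)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


def davies_bouldin(X, labels):
    """Davies-Bouldin index of a partition; lower means better separation."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)
    centroids = np.array([X[labels == c].mean(axis=0) for c in ids])
    # Mean distance of each cluster's members to their own centroid.
    scatter = np.array([
        np.linalg.norm(X[labels == c] - centroids[i], axis=1).mean()
        for i, c in enumerate(ids)
    ])
    k = len(ids)
    worst = []
    for i in range(k):
        # For cluster i, keep its worst (largest) similarity to another cluster.
        ratios = [
            (scatter[i] + scatter[j])
            / np.linalg.norm(centroids[i] - centroids[j])
            for j in range(k)
            if j != i
        ]
        worst.append(max(ratios))
    return float(np.mean(worst))


# Toy data (made up for illustration, not taken from the paper).
candidates = [(2, 10.0), (3, 6.0), (4, 6.5), (5, 3.0)]
front = pareto_front(candidates)

X = [[0, 0], [0, 1], [10, 0], [10, 1]]
db_good = davies_bouldin(X, [0, 0, 1, 1])  # two compact, distant groups
db_bad = davies_bouldin(X, [0, 1, 0, 1])   # groups that overlap heavily
```

In this toy run the candidate with 4 clusters is dominated by the one with 3, so it drops off the front. Each surviving candidate could then be scored with a validity index such as davies_bouldin and sorted to rank the alternative numbers of clusters.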