Detecting clusters of different geometrical shapes in microarray gene expression data

Authors:
Dae-Won Kim;Kwang H. Lee;Doheon Lee
Affiliations:
Department of BioSystems and Advanced Information Technology Research Center, Korea Advanced Institute of Science and Technology, 373-1 Guseong-dong, Yuseong-gu, Daejeon, 305-701, Korea (all three authors)
Venue:
Bioinformatics
Year:
2005

Abstract

Motivation: Clustering is a popular technique for finding groups of genes that show similar expression patterns under multiple experimental conditions. Many clustering methods have been proposed for gene-expression data, including hierarchical clustering, k-means clustering and the self-organizing map (SOM). However, these conventional methods are limited in their ability to identify clusters of different shapes because they use a fixed distance norm when calculating the distance between genes. A fixed distance norm imposes a fixed geometrical shape on the clusters regardless of the actual data distribution; the Euclidean norm, for example, favors hyperspherical clusters. Thus, different distance norms are required to handle clusters of different shapes.
Results: We present the Gustafson-Kessel (GK) clustering method for microarray gene-expression data. To detect clusters of different shapes in a dataset, we use an adaptive distance norm calculated from the fuzzy covariance matrix (F) of each cluster; the eigenstructure of F indicates the shape of the cluster. Moreover, the GK method is less prone to falling into local minima than k-means and SOM because it bases its decisions on each gene's degrees of membership in the clusters. The algorithm uses alternating optimization, which iteratively improves a sequence of sets of clusters until no further improvement is possible. To test its performance, we applied the GK method and well-known conventional methods to three recently published yeast datasets, and compared the results of each method using the Saccharomyces Genome Database annotations. The clusters produced by the GK method are significantly more relevant to the biological annotations than those of the other methods, demonstrating its effectiveness and potential for clustering gene-expression data.
Availability: The software was developed in Java and runs on any platform with a Java Virtual Machine (JVM). It is available from the authors upon request.
Contact: dhlee@bisl.kaist.ac.kr
Supplementary information: Supplementary data are available at http://dragon.kaist.ac.kr/gk
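
The adaptive distance norm and alternating optimization described in the abstract can be illustrated with a small sketch. This is not the authors' Java software; it is a toy version of the standard Gustafson-Kessel update rules, restricted to 2-D points so that the determinant and inverse of F can be written in closed form. Several choices here are assumptions made only for illustration: the function names `gk_cluster` and `_memberships`, the fuzzifier m = 2, a unit cluster-volume constraint (so each norm matrix is A = det(F)^(1/2) F^(-1) in two dimensions), seeding from evenly spaced data points, and a tiny ridge added to F to keep it invertible.

```python
import math


def _memberships(points, centers, norms, m):
    """Membership update: u_ik = 1 / sum_j (d_ik^2 / d_jk^2)^(1/(m-1))."""
    p = 1.0 / (m - 1.0)
    u = []
    for x, y in points:
        d2 = []
        for (vx, vy), (axx, axy, ayy) in zip(centers, norms):
            dx, dy = x - vx, y - vy
            # Squared distance under this cluster's own norm matrix A_i.
            d = axx * dx * dx + 2.0 * axy * dx * dy + ayy * dy * dy
            d2.append(max(d, 1e-12))  # guard against division by zero
        u.append([1.0 / sum((di / dj) ** p for dj in d2) for di in d2])
    return u


def gk_cluster(points, c, m=2.0, max_iter=100, tol=1e-6):
    """Toy Gustafson-Kessel clustering for 2-D points; requires c >= 2."""
    n = len(points)
    # Assumed initialisation: evenly spaced data points as seed centers,
    # with the identity (Euclidean) norm for the first membership pass.
    centers = [points[i * (n - 1) // (c - 1)] for i in range(c)]
    norms = [(1.0, 0.0, 1.0)] * c
    u = _memberships(points, centers, norms, m)
    covs = []
    for _ in range(max_iter):
        centers, covs, norms = [], [], []
        for i in range(c):
            w = [row[i] ** m for row in u]
            sw = sum(w)
            vx = sum(wk * p[0] for wk, p in zip(w, points)) / sw
            vy = sum(wk * p[1] for wk, p in zip(w, points)) / sw
            # Fuzzy covariance matrix F_i, stored as (fxx, fxy, fyy).
            fxx = sum(wk * (p[0] - vx) ** 2 for wk, p in zip(w, points)) / sw + 1e-9
            fyy = sum(wk * (p[1] - vy) ** 2 for wk, p in zip(w, points)) / sw + 1e-9
            fxy = sum(wk * (p[0] - vx) * (p[1] - vy) for wk, p in zip(w, points)) / sw
            det = fxx * fyy - fxy * fxy
            s = 1.0 / math.sqrt(det)
            centers.append((vx, vy))
            covs.append((fxx, fxy, fyy))
            # A_i = det(F_i)^(1/2) * F_i^(-1), so det(A_i) = 1 (unit volume).
            norms.append((fyy * s, -fxy * s, fxx * s))
        new_u = _memberships(points, centers, norms, m)
        change = max(abs(a - b) for ra, rb in zip(new_u, u) for a, b in zip(ra, rb))
        u = new_u
        if change < tol:  # stop when memberships no longer change
            break
    return centers, covs, u


# Usage: one horizontally elongated group and one vertically elongated group.
horizontal = [(float(k), 0.1 * (-1) ** k) for k in range(10)]
vertical = [(20.0 + 0.1 * (-1) ** k, 10.0 + k) for k in range(10)]
points = horizontal + vertical
centers, covs, u = gk_cluster(points, c=2)
labels = [max(range(2), key=lambda i: row[i]) for row in u]
print(labels)
print(covs)
```

In this sketch each cluster carries its own norm matrix, so the horizontal and vertical groups can each be fitted with a differently oriented ellipse, whereas a single fixed norm would apply the same shape to both. Real expression profiles have many more than two dimensions, so a practical version would need a general determinant and matrix inverse in place of the closed-form 2-D expressions.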