Cluster analysis of gene expression data based on self-splitting and merging competitive learning

Authors:
Shuanhu Wu;A. W.-C. Liew;Hong Yan;Mengsu Yang
Affiliations:
Dept. of Comput. Eng. & Inf. Technol., City Univ. of Hong Kong, China;-;-;-
Venue:
IEEE Transactions on Information Technology in Biomedicine
Year:
2004

Citing 0
Cited 11

MMR: An algorithm for clustering categorical data using Rough Set Theory
Data & Knowledge Engineering

Spectral similarity for analysis of DNA microarray time-series data
International Journal of Data Mining and Bioinformatics

Generalized fuzzy C-means clustering algorithm with improved fuzzy partitions
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

Pattern recognition techniques for the emerging field of bioinformatics: A review
Pattern Recognition

A rough set approach for selecting clustering attribute
Knowledge-Based Systems

Multiple self-splitting and merging competitive learning algorithm
PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining

Gene clustering by using query-based self-organizing maps
Expert Systems with Applications: An International Journal

Mining microarray gene expression data with unsupervised possibilistic clustering and proximity graphs
Applied Intelligence

Efficient matching and retrieval of gene expression time series data based on spectral information
ICCSA'05 Proceedings of the 2005 international conference on Computational Science and Its Applications - Volume Part III

OPTOC-based clustering analysis of gene expression profiles in spectral space
ISNN'05 Proceedings of the Second international conference on Advances in Neural Networks - Volume Part III

MAR: Maximum Attribute Relative of soft set for clustering attribute selection
Knowledge-Based Systems

Abstract

Cluster analysis of gene expression data from a cDNA microarray is useful for identifying biologically relevant groups of genes. However, finding the natural clusters in the data and estimating the correct number of clusters remain two largely unsolved problems. In this paper, we propose a new clustering framework that addresses both of these problems. By using the one-prototype-take-one-cluster (OPTOC) competitive learning paradigm, the proposed algorithm can find natural clusters in the input data, and the clustering solution is not sensitive to initialization. To estimate the number of distinct clusters in the data, we propose a cluster splitting and merging strategy. We have applied the new algorithm to simulated gene expression data for which the correct distribution of genes over clusters is known a priori. The results show that the proposed algorithm finds the natural clusters and gives the correct number of clusters. The algorithm has also been tested on real gene expression data from the yeast cell cycle, for which the fundamental patterns of gene expression and the assignment of genes to clusters are well understood from numerous previous studies. Comparative studies with several other clustering algorithms illustrate the effectiveness of our method.
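
The general idea of estimating the number of clusters by splitting and then merging can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify the OPTOC update rule or the split and merge criteria, so the sketch substitutes plain winner-take-all (Lloyd-style) updates and two simple distance thresholds. The names `split_and_merge`, `split_radius`, and `merge_dist` are hypothetical and chosen only for this illustration.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def assign(points, protos):
    """Winner-take-all: index of the nearest prototype for each point."""
    return [min(range(len(protos)), key=lambda k: dist(p, protos[k]))
            for p in points]

def refine(points, protos, iters=20):
    """Move each prototype to the mean of the points it wins (not OPTOC)."""
    for _ in range(iters):
        labels = assign(points, protos)
        new = []
        for k in range(len(protos)):
            members = [p for p, l in zip(points, labels) if l == k]
            new.append(mean(members) if members else protos[k])
        if new == protos:
            break
        protos = new
    return protos

def split_and_merge(points, split_radius, merge_dist, max_clusters=20):
    """Assumed criteria: split while some point lies farther than
    split_radius from its prototype; merge prototypes closer than merge_dist."""
    protos = [mean(points)]
    # Splitting phase: seed a new prototype at the worst-represented point.
    while len(protos) < max_clusters:
        protos = refine(points, protos)
        labels = assign(points, protos)
        far_d, far_p = max((dist(p, protos[l]), p)
                           for p, l in zip(points, labels))
        if far_d <= split_radius:
            break
        protos.append(far_p)
    protos = refine(points, protos)
    # Merging phase: fuse one close pair at a time, then re-refine.
    merged = True
    while merged and len(protos) > 1:
        merged = False
        for i in range(len(protos)):
            for j in range(i + 1, len(protos)):
                if dist(protos[i], protos[j]) < merge_dist:
                    fused = mean([protos[i], protos[j]])
                    protos = [q for k, q in enumerate(protos)
                              if k not in (i, j)] + [fused]
                    protos = refine(points, protos)
                    merged = True
                    break
            if merged:
                break
    return protos, assign(points, protos)

# Toy data: three tight, well-separated groups in 2D (illustrative only).
offsets = [(0, 0), (0.5, 0.5), (0.5, -0.5), (-0.5, 0.5), (-0.5, -0.5)]
centers = [(0, 0), (10, 0), (0, 10)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]
protos, labels = split_and_merge(points, split_radius=2.0, merge_dist=3.0)
print(len(protos), sorted(protos))
```

In this sketch the number of clusters is never supplied by the caller; it emerges from the two thresholds, which is the role the abstract assigns to the splitting and merging strategy. Choosing those thresholds is itself a design decision that the abstract does not describe.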