An efficient greedy K-means algorithm for global gene trajectory clustering

  • Authors:
  • Zeke S. H. Chan;Lesley Collins;N. Kasabov

  • Affiliations:
  • Knowledge Engineering and Discovery Research Institute (KEDRI), Auckland University of Technology, Auckland, New Zealand;Allan Wilson Center for Molecular Ecology and Evolution, Massey University, Palmerston North, New Zealand;Knowledge Engineering and Discovery Research Institute (KEDRI), Auckland University of Technology, Auckland, New Zealand

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2006

Abstract

Optimal clustering of co-regulated genes is critical for reliable inference of the underlying biological processes in gene expression analysis, for which the K-means algorithm has been widely employed because of its efficiency. However, when the solution space is large and multimodal, as is typical of gene expression data, K-means is prone to producing inconsistent and sub-optimal cluster solutions that may be unreliable and misleading for biological interpretation. This paper applies a novel global clustering method called the greedy elimination method (GEM) to alleviate these problems. GEM is simple to implement, yet very effective in improving the global optimality of the solutions. Experiments on two sets of gene expression data show that GEM achieves significantly lower clustering errors than both the standard K-means and the greedy incremental method.
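
The inconsistency of standard K-means noted in the abstract can be illustrated with a minimal sketch. It is not the authors' GEM, whose steps the abstract does not describe. It only runs plain Lloyd-style K-means from several random initializations on synthetic data and compares the resulting sum of squared errors (SSE). The function names (`kmeans`, `make_toy_trajectories`), the toy "trajectories" of three time points, and all parameter values are assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, seed, iters=100):
    """Plain Lloyd's K-means with randomly sampled initial centers.

    Returns (centers, sse). Illustrative only, not the paper's GEM.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            groups[j].append(p)
        # Update step: move each center to its group mean.
        # An empty group keeps its previous center.
        new_centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse


def make_toy_trajectories(seed=0, per_group=25):
    """Synthetic stand-in for expression profiles (hypothetical data)."""
    rng = random.Random(seed)
    prototypes = [(0, 0, 0), (0, 4, 0), (4, 0, 4), (4, 4, 4)]
    return [
        tuple(v + rng.gauss(0, 0.3) for v in proto)
        for proto in prototypes
        for _ in range(per_group)
    ]


points = make_toy_trajectories()
# Same data, same K, different random starts.
errors = [kmeans(points, k=4, seed=s)[1] for s in range(20)]
print(f"best SSE  = {min(errors):.2f}")
print(f"worst SSE = {max(errors):.2f}")
```

Any gap between the best and worst SSE across seeds reflects the dependence on initialization that motivates global methods such as GEM. How large that gap is depends on the data and the seeds chosen.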