Clustering Time-Series Gene Expression Data with Unequal Time Intervals

  • Authors: Luis Rueda, Ataul Bari, Alioune Ngom

  • Affiliation (all authors): School of Computer Science, University of Windsor, Windsor, Canada N9B 3P4

  • Venue: Transactions on Computational Systems Biology X
  • Year: 2008

Abstract

Clustering gene expression data given as time series is a challenging problem with its own particular constraint: the order of the time points matters, so two or more time points cannot be exchanged without producing quite different results and leading to erroneous biological conclusions. We focus on issues related to clustering gene expression temporal profiles and devise a novel algorithm for clustering temporal gene expression profiles from microarray data. The proposed method introduces the concept of profile alignment, which is achieved by minimizing the area between two aligned profiles. The overall pattern of expression in the time-series context is captured by combining agglomerative clustering with profile alignment, and the optimal number of clusters for a given dataset is found by means of a variant of a clustering index. The effectiveness of the proposed approach is demonstrated on two well-known datasets, yeast and serum, and corroborated with a set of pre-clustered yeast genes, on which the method achieves very high classification accuracy even though it is an unsupervised scheme.
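
The abstract does not spell out how alignment or merging is computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that profiles are piecewise linear between measured time points, that alignment is a vertical shift of one profile, that "area" is the integral of the squared difference, and that clusters are merged by average linkage. The names `aligned_distance` and `agglomerative` are hypothetical, and the choice of the number of clusters by a clustering index is left out.

```python
import numpy as np


def aligned_distance(x, y, t):
    """Squared area between profiles x and y after shifting y vertically.

    Assumption: both profiles are piecewise linear over the same,
    possibly unequally spaced, time points t.
    Returns (area, shift), where shift is the offset added to y.
    """
    x, y, t = (np.asarray(v, dtype=float) for v in (x, y, t))
    d = x - y                      # difference at each time point
    h = np.diff(t)                 # interval widths (may be unequal)
    total = t[-1] - t[0]
    # The shift minimizing the integral of (d(t) - shift)^2 is the
    # time-weighted mean of d (trapezoid rule, exact for linear pieces).
    shift = np.sum(h * (d[:-1] + d[1:]) / 2.0) / total
    e = d - shift
    # Exact integral of the square of a linear segment from e_k to e_{k+1}.
    area = np.sum(h * (e[:-1] ** 2 + e[:-1] * e[1:] + e[1:] ** 2) / 3.0)
    return area, shift


def agglomerative(profiles, t, k):
    """Naive average-linkage agglomerative clustering into k clusters.

    Assumption: average linkage over aligned distances; the source does
    not state which linkage is used.
    """
    profiles = np.asarray(profiles, dtype=float)
    n = len(profiles)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = aligned_distance(
                profiles[i], profiles[j], t)[0]
    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                dab = dist[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or dab < best[0]:
                    best = (dab, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]   # merge the closest pair
        del clusters[b]
    return clusters


# Illustrative data only: unequal time intervals, two shapes, two offsets.
t = [0.0, 1.0, 3.0, 7.0]
rising = np.array([1.0, 2.0, 4.0, 3.0])
falling = np.array([3.0, 2.0, 1.0, 0.0])
profiles = [rising, rising + 5.0, falling, falling - 2.0]
clusters = agglomerative(profiles, t, k=2)
```

Because the interval widths `h` weight each segment, a long gap between samples contributes proportionally more to the distance than a short one, which is how this sketch accounts for unequal time intervals. Profiles that differ only by a constant offset get a distance of zero after alignment, so they fall into the same cluster.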