Cluster analysis on time series gene expression data

  • Authors:
  • Huang-Cheng Kuo;Tsung-Lung Lee;Jen-Peng Huang

  • Affiliations:
  • Department of Computer Science and Information Engineering, National Chiayi University, Chia-Yi City 600, Taiwan; Department of Computer Science and Information Engineering, National Chiayi University, Chia-Yi City 600, Taiwan; Department of Information Management, Southern Taiwan University, Tainan County 710, Taiwan

  • Venue:
  • International Journal of Business Intelligence and Data Mining
  • Year:
  • 2010

Abstract

Cluster analysis is frequently used to study trends in gene expression behaviour from microarray time series data. We adopt a partitioning-based clustering algorithm for this task. After the time series are discretised into sequences, a sequential pattern mining technique is applied to find patterns that serve as the initial clusters. Longest Common Subseries Similarity, which overcomes the 'shift-effect' influence, is used to measure the similarity between time series. An object is re-assigned to the cluster that contains the most objects among the object's k nearest neighbours. Similarity measures, such as the Pearson correlation coefficient, are used to determine the neighbours.
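
As an illustration only, the following minimal Python sketch shows how two of the steps described in the abstract might look. It is not the authors' implementation. Several parts are assumptions: the three-symbol discretisation (up, down, steady), the use of the classic longest common subsequence on the discretised sequences as a stand-in for Longest Common Subseries Similarity, the normalisation by the shorter sequence length, and the function names `discretise`, `lcs_similarity` and `reassign`. The sequential pattern mining step that produces the initial clusters is omitted, so the initial labels are simply given.

```python
from collections import Counter

import numpy as np


def discretise(series):
    """Map a numeric series to symbols (assumed scheme): U=up, D=down, S=steady."""
    diffs = np.diff(np.asarray(series, dtype=float))
    return "".join("U" if d > 0 else "D" if d < 0 else "S" for d in diffs)


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a, b):
    """LCS length divided by the shorter length (an assumed normalisation)."""
    shorter = min(len(a), len(b))
    return lcs_length(a, b) / shorter if shorter else 0.0


def reassign(data, labels, k):
    """One re-assignment pass: each object moves to the cluster that holds
    the most of its k nearest neighbours, ranked by Pearson correlation.
    Votes are counted against the labels from before the pass."""
    corr = np.corrcoef(data)          # pairwise Pearson correlation of rows
    np.fill_diagonal(corr, -np.inf)   # an object is not its own neighbour
    new_labels = labels.copy()
    for i in range(len(data)):
        neighbours = np.argsort(corr[i])[::-1][:k]
        votes = Counter(labels[neighbours].tolist())
        new_labels[i] = votes.most_common(1)[0][0]
    return new_labels


# Hypothetical toy data: four rising and four falling expression profiles.
data = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.5],
    [1.5, 2.5, 3.4, 4.6],
    [1.0, 2.0, 3.0, 4.2],   # rising, but initially placed in cluster 1
    [4.0, 3.0, 2.0, 1.0],
    [5.0, 4.0, 2.5, 1.0],
    [4.5, 3.1, 2.2, 1.0],
    [6.0, 5.0, 3.0, 2.0],
])
labels = np.array([0, 0, 0, 1, 1, 1, 1, 1])
updated = reassign(data, labels, k=3)
```

In this toy run the mislabelled rising profile is moved to cluster 0, because its three most correlated neighbours all belong to that cluster. The sketch also hints at why a subsequence-based measure tolerates shifts: the symbol sequences "SUUDD" and "UUDDS" describe the same rise and fall offset by one time step, and they still share the common subsequence "UUDD". In practice the pass would be repeated until the labels stop changing; the stopping rule and the choice of k are not given in the abstract.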