Order preserving clustering by finding frequent orders in gene expression data

  • Authors:
  • Li Teng;Laiwan Chan

  • Affiliations:
  • Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong;Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong

  • Venue:
  • PRIB'07 Proceedings of the 2nd IAPR international conference on Pattern recognition in bioinformatics
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper concerns the discovery of Order Preserving Clusters (OP-Clusters) in gene expression data, in each of which a subset of genes induce a similar linear ordering along a subset of conditions. After converting each gene vector into an ordered label sequence. The problem is transferred into finding frequent orders appearing in the sequence set. We propose an algorithm of finding the frequent orders by iteratively Combining the most Frequent Prefixes and Suffixes (CFPS) in a statistical way. We also define the significance of an OP-Cluster. Our method has good scale-up property with dimension of the dataset and size of the cluster. Experimental study on both synthetic datasets and real gene expression dataset shows our approach is very effective and efficient.