Semantic mining and analysis of gene expression data

  • Authors:
  • Xin Xu;Gao Cong;Beng Chin Ooi;Kian-Lee Tan;Anthony K. H. Tung

  • Affiliations:
  • School of Computing, National University of Singapore, Singapore (all authors)

  • Venue:
  • VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
  • Year:
  • 2004

Abstract

Association rules can reveal biologically relevant relationships between genes and environments or categories. However, most existing association rule mining algorithms are impractical on gene expression data, which typically contain thousands or tens of thousands of columns (gene expression levels) but only tens of rows (samples). The main problem is that these algorithms scale exponentially with the number of columns. A further shortcoming is that far too many associations are generated from this kind of data. To address both problems, we have developed FARMER [2], a novel depth-first, row-wise algorithm designed to efficiently discover association rules and cluster them into interesting rule groups (IRGs) that satisfy user-specified minimum support, confidence, and chi-square thresholds on biological datasets, rather than finding association rules individually. Based on FARMER, we have developed a prototype system that integrates semantic mining and visual analysis of IRGs mined from gene expression data.
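To make the three thresholds named in the abstract concrete, the following is a minimal sketch of how support, confidence, and a chi-square value might be computed for a single rule of the form "gene items => class" on a tiny table. It is not the FARMER algorithm and does not reproduce its row-wise enumeration or rule grouping. The function name `rule_stats`, the toy data, the item names `g1` to `g3`, and the choice to report support as an absolute row count are all assumptions made for illustration.

```python
from typing import FrozenSet, List, Tuple

# One sample (row): the set of discretized gene items it contains, plus its class label.
Row = Tuple[FrozenSet[str], str]


def rule_stats(rows: List[Row], antecedent: FrozenSet[str], target: str) -> Tuple[int, float, float]:
    """Return (support, confidence, chi-square) for the rule antecedent => target.

    Hypothetical helper: support is taken as the number of rows that contain
    the antecedent and carry the target class (an assumption of this sketch).
    """
    n = len(rows)
    # 2x2 contingency table: antecedent matched or not, target class or not.
    a = sum(1 for items, label in rows if antecedent <= items and label == target)
    b = sum(1 for items, label in rows if antecedent <= items and label != target)
    c = sum(1 for items, label in rows if not antecedent <= items and label == target)
    d = n - a - b - c

    support = a
    confidence = a / (a + b) if (a + b) else 0.0

    # Pearson chi-square statistic for a 2x2 table, without continuity correction.
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return support, confidence, chi2


# Toy data: six samples, three gene items (illustrative values only).
rows: List[Row] = [
    (frozenset({"g1", "g2", "g3"}), "cancer"),
    (frozenset({"g1", "g2"}), "cancer"),
    (frozenset({"g1", "g3"}), "cancer"),
    (frozenset({"g2", "g3"}), "normal"),
    (frozenset({"g3"}), "normal"),
    (frozenset({"g1", "g2"}), "normal"),
]

support, confidence, chi2 = rule_stats(rows, frozenset({"g1", "g2"}), "cancer")
print(support, round(confidence, 2), round(chi2, 2))
```

On this toy table the rule {g1, g2} => cancer has support 2, confidence of about 0.67, and a chi-square value of about 0.67. A rule would be kept only if all three values meet the user-specified minimums.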