Rule Induction for Classification of Gene Expression Array Data

  • Authors:
  • Per Lidén;Lars Asker;Henrik Boström

  • Venue:
  • PKDD '02 Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery
  • Year:
  • 2002

Abstract

Gene expression array technology has rapidly become a standard tool for biologists. Its use within areas such as diagnostics, toxicology, and genetics calls for good methods for finding patterns and prediction models from the generated data. Rule induction is one promising candidate method due to several attractive properties, such as a high level of expressiveness and interpretability. In this work we investigate the use of rule induction methods for mining gene expression patterns from various cancer types. Three different rule induction methods are evaluated on two public tumor tissue data sets. The methods are shown to obtain prediction accuracy as good as that of the best current methods, while at the same time allowing for straightforward interpretation of the prediction models. These models typically consist of small sets of simple rules, which associate a few genes and expression levels with specific types of cancer. We also show that information gain is a useful measure for ranked feature selection in this domain.
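
As an illustration of ranked feature selection by information gain, the following is a minimal sketch and not the authors' implementation. It assumes each gene's continuous expression values are split at a single threshold chosen to maximize the gain; the abstract does not say how expression levels were discretized, so that choice, the function names, the gene names, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting the samples at `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - remainder


def best_gain(values, labels):
    """Highest gain over midpoints between consecutive distinct values.

    Assumption: one binary split per gene; other discretizations are possible.
    """
    distinct = sorted(set(values))
    thresholds = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max((information_gain(values, labels, t) for t in thresholds), default=0.0)


def rank_genes(expression, labels):
    """Return (gene, gain) pairs sorted from most to least informative."""
    scores = {gene: best_gain(vals, labels) for gene, vals in expression.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: four tissue samples, two tumor classes, two genes.
labels = ["tumor_A", "tumor_A", "tumor_B", "tumor_B"]
expression = {
    "gene_a": [0.1, 0.2, 0.9, 1.0],  # separates the two classes cleanly
    "gene_b": [0.5, 0.9, 0.4, 1.0],  # only weakly related to the class
}
ranked = rank_genes(expression, labels)
for gene, gain in ranked:
    print(f"{gene}: {gain:.3f}")
```

Under these assumptions, a gene whose expression separates the classes at some threshold ranks ahead of one whose values are mixed across classes, and the top-ranked genes could then be passed to a rule learner.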