Relevance, redundancy and differential prioritization in feature selection for multiclass gene expression data

  • Authors:
  • Chia Huey Ooi;Madhu Chetty;Shyh Wei Teng

  • Affiliations:
  • Gippsland School of Information Technology, Monash University, Churchill, Australia;Gippsland School of Information Technology, Monash University, Churchill, Australia;Gippsland School of Information Technology, Monash University, Churchill, Australia

  • Venue:
  • ISBMDA'05 Proceedings of the 6th International conference on Biological and Medical Data Analysis
  • Year:
  • 2005

Abstract

The large number of genes in microarray data makes feature selection more crucial than ever. From ranking-based filter procedures to classifier-based wrapper techniques, many studies have devised their own variants of feature selection. Only a handful of studies have examined the effect of redundancy in the predictor set on classification accuracy, and even fewer have examined the effect of varying the relative importance of relevance and redundancy. We present a filter-based feature selection technique that incorporates three elements: relevance, redundancy and differential prioritization. With the aid of differential prioritization, our technique achieves better accuracies than those of previous studies while using fewer genes in the predictor set. At the same time, the pitfall of over-optimistic accuracy estimates is avoided by using a more realistic evaluation procedure than internal leave-one-out cross-validation.
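
The abstract does not give the scoring formulas, so the following is only an illustrative sketch of how relevance, redundancy and a prioritization weight might be combined in a greedy filter. Everything in it is an assumption rather than the authors' method: relevance is taken as a between-class to within-class sum-of-squares ratio, redundancy as absolute Pearson correlation, and the hypothetical parameter `alpha` (between 0 and 1) shifts priority from antiredundancy (`alpha = 0`) to relevance (`alpha = 1`). The function names and toy data are likewise made up for illustration.

```python
from statistics import mean


def relevance(column, labels):
    """Assumed relevance score: between-class / within-class sum of squares."""
    overall = mean(column)
    between = within = 0.0
    for cls in set(labels):
        values = [x for x, y in zip(column, labels) if y == cls]
        centre = mean(values)
        between += len(values) * (centre - overall) ** 2
        within += sum((x - centre) ** 2 for x in values)
    return between / (within + 1e-12)  # small constant avoids division by zero


def abs_correlation(a, b):
    """Assumed redundancy measure: absolute Pearson correlation, clamped to [0, 1]."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return min(1.0, abs(cov) / (var_a * var_b) ** 0.5)


def select_features(columns, labels, k, alpha):
    """Greedily pick k feature indices, weighting relevance against antiredundancy."""
    rel = [relevance(c, labels) for c in columns]
    top = max(rel) or 1.0
    rel = [r / top for r in rel]  # scale relevance into [0, 1]
    selected = [max(range(len(columns)), key=lambda j: rel[j])]
    while len(selected) < k:
        best, best_score = None, -1.0
        for j in range(len(columns)):
            if j in selected:
                continue
            # Antiredundancy: mean dissimilarity to the features already chosen.
            anti = mean([1.0 - abs_correlation(columns[j], columns[s])
                         for s in selected])
            score = rel[j] ** alpha * anti ** (1.0 - alpha)
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected


# Toy data: three classes, two samples each; gene 1 is a rescaled copy of gene 0.
labels = [0, 0, 1, 1, 2, 2]
columns = [
    [1.0, 1.2, 5.0, 5.2, 9.0, 9.2],
    [2.0, 2.4, 10.0, 10.4, 18.0, 18.4],
    [1.0, 1.5, 1.0, 1.5, 4.0, 4.5],
]
print(select_features(columns, labels, k=2, alpha=1.0))
print(select_features(columns, labels, k=2, alpha=0.5))
```

On this toy data, setting `alpha` to 1 ignores redundancy and picks the two near-duplicate genes, while a lower `alpha` replaces one of them with the less relevant but less correlated gene. In the spirit of the abstract's warning about over-optimistic estimates, any such selection would need to be repeated inside each training split rather than run once on the full data set.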