Relevance, redundancy and differential prioritization in feature selection for multiclass gene expression data

Authors:
Chia Huey Ooi; Madhu Chetty; Shyh Wei Teng
Affiliations:
Gippsland School of Information Technology, Monash University, Churchill, Australia (all authors)
Venue:
ISBMDA'05: Proceedings of the 6th International Conference on Biological and Medical Data Analysis
Year:
2005

Abstract

The large number of genes in microarray data makes feature selection more crucial than ever. Many studies have devised their own feature selection techniques, ranging from ranking-based filter procedures to classifier-based wrappers. Only a handful have examined how redundancy in the predictor set affects classification accuracy, and even fewer have examined the effect of varying the relative importance of relevance and redundancy. We present a filter-based feature selection technique that incorporates three elements: relevance, redundancy and differential prioritization (the relative weight placed on relevance versus redundancy). With the aid of differential prioritization, the technique achieves better accuracies than those reported in previous studies while using fewer genes in the predictor set. At the same time, it avoids over-optimistic accuracy estimates by using an evaluation procedure more realistic than internal leave-one-out cross-validation.
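
The abstract does not give the scoring formula, so the following is only an illustrative sketch of how relevance, redundancy and a prioritization weight might be combined in a filter. It assumes a greedy forward search, a between-class to within-class sum-of-squares ratio as the relevance measure, one minus the mean absolute Pearson correlation as the antiredundancy measure, and a score of the form relevance^alpha * antiredundancy^(1 - alpha). All function names, the parameter alpha and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def relevance_scores(X, y):
    """Per-feature ratio of between-class to within-class sum of squares."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        rows = X[y == label]
        class_mean = rows.mean(axis=0)
        between += rows.shape[0] * (class_mean - overall_mean) ** 2
        within += ((rows - class_mean) ** 2).sum(axis=0)
    return between / (within + 1e-12)  # small constant avoids division by zero


def select_features(X, y, n_select, alpha):
    """Greedy forward selection (assumed scheme, not the paper's exact method).

    alpha in [0, 1] is the prioritization weight: 1.0 ranks purely by
    relevance; smaller values favor low correlation within the chosen set.
    """
    relevance = relevance_scores(X, y)
    abs_corr = np.abs(np.corrcoef(X, rowvar=False))
    selected = [int(np.argmax(relevance))]  # start from the most relevant feature
    while len(selected) < n_select:
        best_feature, best_score = None, -np.inf
        for candidate in range(X.shape[1]):
            if candidate in selected:
                continue
            subset = selected + [candidate]
            mean_relevance = relevance[subset].mean()
            # Antiredundancy: 1 - mean |correlation| over distinct feature pairs.
            block = abs_corr[np.ix_(subset, subset)]
            upper = np.triu_indices(len(subset), k=1)
            antiredundancy = max(0.0, 1.0 - block[upper].mean())
            score = mean_relevance ** alpha * antiredundancy ** (1.0 - alpha)
            if score > best_score:
                best_feature, best_score = candidate, score
        selected.append(best_feature)
    return selected


def make_toy_data(seed=0):
    """Three classes; feature 1 is a near-duplicate of feature 0."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1, 2], 20)
    f0 = 6.0 * y + rng.normal(0.0, 0.5, y.size)         # highly relevant
    f1 = f0 + rng.normal(0.0, 0.05, y.size)             # redundant copy of f0
    f2 = 3.0 * (y == 1) + rng.normal(0.0, 0.5, y.size)  # separates class 1 only
    noise = rng.normal(0.0, 1.0, (y.size, 5))           # irrelevant features
    return np.column_stack([f0, f1, f2, noise]), y


if __name__ == "__main__":
    X, y = make_toy_data()
    print("alpha=1.0:", select_features(X, y, 2, alpha=1.0))
    print("alpha=0.5:", select_features(X, y, 2, alpha=0.5))
```

On this toy data, alpha = 1.0 selects both near-duplicate features because only relevance counts, whereas alpha = 0.5 rejects the duplicate as the second pick. In keeping with the abstract's caution about over-optimistic estimates, a routine like this would be run on the training portion of each split only, with accuracy measured on samples that played no part in selection.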