Short communication: Diagnosis of bladder cancers with small sample size via feature selection

  • Authors: T. Warren Liao
  • Affiliations: Department of Construction Management and Industrial Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
  • Venue: Expert Systems with Applications: An International Journal
  • Year: 2011

Abstract

This paper proposes feature selection as a way to handle a bladder cancer data set with a small sample size. Three feature selection methods and four classifiers were used to determine the best feature subsets, that is, those that produce perfect classification accuracy. The smallest best feature subsets were then used to build neural models on the small data set, achieving 100% training and testing accuracy. The mega-trend-diffusion technique proposed by Li et al. for producing artificial samples is therefore unnecessary. The similarity classifier proposed by Luukka was also applied to the small data set with the smallest best feature subsets and achieved 100% accuracy using only 4 samples (two with bladder cancer and two normal) for two selected p and m values. Given the same accuracy, using the selected best feature subsets is preferable to using all 13 features, as Luukka did. Finally, several indexes and methods commonly used in filter-based feature selection were tested for their ability to find the best feature subsets for this particular small bladder cancer data set.
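
The search for the "smallest best feature subsets" described above can be illustrated with a minimal sketch. It rests on several assumptions: the abstract does not name the three feature selection methods or the four classifiers, so an exhaustive wrapper search, a nearest-centroid classifier, and leave-one-out validation are used here as stand-ins. The data are synthetic (4 features, 8 samples, not the bladder cancer data), and all function names are hypothetical.

```python
from itertools import combinations


def nearest_centroid_predict(train_X, train_y, x, features):
    """Predict the label of x from class centroids restricted to `features`.

    Stand-in classifier (an assumption, not one named in the abstract).
    """
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, f in enumerate(features):
            acc[k] += row[f]
        counts[label] = counts.get(label, 0) + 1

    best_label, best_dist = None, float("inf")
    for label in sorted(sums):
        centroid = [s / counts[label] for s in sums[label]]
        dist = sum((x[f] - c) ** 2 for f, c in zip(features, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy using only the given feature indices."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i], features) == y[i]:
            correct += 1
    return correct / len(X)


def smallest_best_subsets(X, y, max_size):
    """Return (size, subsets) for the smallest subsets with perfect accuracy."""
    n_features = len(X[0])
    for size in range(1, max_size + 1):
        perfect = [
            subset
            for subset in combinations(range(n_features), size)
            if loo_accuracy(X, y, subset) == 1.0
        ]
        if perfect:
            return size, perfect
    return None, []


# Synthetic data: only feature index 2 separates the two classes.
X = [
    [0.2, 0.9, 0.10, 0.5],
    [0.8, 0.1, 0.15, 0.4],
    [0.5, 0.5, 0.20, 0.6],
    [0.3, 0.7, 0.12, 0.5],
    [0.7, 0.2, 0.90, 0.5],
    [0.1, 0.8, 0.85, 0.6],
    [0.6, 0.4, 0.80, 0.4],
    [0.4, 0.6, 0.95, 0.5],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

size, subsets = smallest_best_subsets(X, y, max_size=2)
print(size, subsets)  # prints: 1 [(2,)]
```

Exhaustive search is feasible only because the feature count is small; with 13 features, as in the data set the abstract mentions, the number of subsets is still manageable (8191 non-empty subsets), but the cost grows exponentially with the number of features.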