Feature shaping for linear SVM classifiers

  • Authors:
  • George Forman;Martin Scholz;Shyamsundar Rajaram

  • Affiliations:
  • Hewlett-Packard Labs, Palo Alto, CA, USA;Hewlett-Packard Labs, Palo Alto, CA, USA;Hewlett-Packard Labs, Palo Alto, CA, USA

  • Venue:
  • Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Linear classifiers have been shown to be effective for many discrimination tasks. Irrespective of the learning algorithm itself, the final classifier has a weight to multiply by each feature. This suggests that ideally each input feature should be linearly correlated with the target variable (or anti-correlated), whereas raw features may be highly non-linear. In this paper, we attempt to re-shape each input feature so that it is appropriate to use with a linear weight and to scale the different features in proportion to their predictive value. We demonstrate that this pre-processing is beneficial for linear SVM classifiers on a large benchmark of text classification tasks as well as UCI datasets.