Robustness analysis of eleven linear classifiers in extremely high-dimensional feature spaces

  • Authors:
  • Ludwig Lausser; Hans A. Kestler

  • Affiliations:
  • Department of Internal Medicine I, University Hospital Ulm, Germany; Department of Internal Medicine I, University Hospital Ulm, Germany

  • Venue:
  • ANNPR'10 Proceedings of the 4th IAPR TC3 conference on Artificial Neural Networks in Pattern Recognition
  • Year:
  • 2010

Abstract

In this study we address the linear classification of noisy high-dimensional data in a two-class scenario. We assume that the cardinality of the data is much lower than its dimensionality. The classification problem in this setting is intensified by the presence of noise. Eleven linear classifiers were compared on 2150 artificial datasets from four different experimental setups, and on five real-world gene expression profile datasets, in terms of classification accuracy and robustness. We specifically focus on linear classifiers, as the use of more complex concept classes would make over-adaptation even more likely. Classification accuracy is measured by mean error rate and mean rank of error rate. These criteria place two large-margin classifiers, SVM and ALMA, and an online classification algorithm called PA at the top, with PA being statistically different from SVM on the artificial data. Surprisingly, these algorithms also significantly outperformed all investigated classifiers that employ dimensionality reduction.
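The experimental regime described above (sample count far below feature count, with label noise, comparing linear classifiers by mean error rate) can be sketched as follows. This is a hypothetical illustration using scikit-learn, not the authors' protocol: the dataset generator, noise level, and cross-validation settings are assumptions, and only two of the eleven classifiers (a linear SVM and the Passive-Aggressive algorithm, PA) are shown.

```python
# Hedged sketch: compare a linear SVM and the online Passive-Aggressive (PA)
# classifier on artificial data where n_samples << n_features, as in the
# regime studied in the paper. All concrete parameter choices here are
# illustrative assumptions, not the paper's experimental setup.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC
from sklearn.linear_model import PassiveAggressiveClassifier

# Few samples, many features, with 5% label noise (flip_y).
X, y = make_classification(n_samples=100, n_features=5000,
                           n_informative=20, flip_y=0.05,
                           random_state=0)

results = {}
for name, clf in [("SVM", LinearSVC(max_iter=10000)),
                  ("PA", PassiveAggressiveClassifier(max_iter=1000,
                                                     random_state=0))]:
    # Mean error rate over 5 cross-validation folds.
    accuracy = cross_val_score(clf, X, y, cv=5).mean()
    results[name] = 1.0 - accuracy

for name, err in sorted(results.items()):
    print(f"{name}: mean error rate = {err:.3f}")
```

In the full study, such per-dataset error rates would additionally be converted to ranks and averaged across datasets to obtain the mean rank criterion.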