A one-dimensional analysis for the probability of error of linear classifiers for normally distributed classes

  • Authors:
  • Luis Rueda

  • Affiliations:
  • School of Computer Science, University of Windsor, 401 Sunset Avenue, Windsor, Ont., N9B 3P4, Canada

  • Venue:
  • Pattern Recognition
  • Year:
  • 2005

Abstract

Computing the probability of error is an important problem in evaluating classifiers. When dealing with normally distributed classes, this problem becomes intricate because there is no closed-form expression for integrating the probability density function. In this paper, we derive lower and upper bounds for the probability of error of a linear classifier, where the random vectors representing the underlying classes obey the multivariate normal distribution. The expression for the error is derived in the one-dimensional space, independently of the dimensionality of the original problem. Based on the two bounds, we propose an approximating expression for the error of a generic linear classifier. In particular, we derive the corresponding bounds and the approximating expression for the error of Fisher's classifier. Our empirical results on synthetic data, including samples with up to two hundred features, show that the computations for the error are extremely fast and quite accurate: the approximation differs from the actual error by at most ε = 0.0184340683. The scheme has also been successfully tested on real-life data sets drawn from the UCI machine learning repository.
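
To illustrate the one-dimensional reduction the abstract describes, below is a minimal sketch, not the paper's bounds or its exact approximating expression: projecting each multivariate normal class onto the weight vector of a linear classifier yields a univariate normal, so the classifier's error reduces to standard-normal CDF evaluations regardless of the original dimensionality. The function name `linear_classifier_error` and all variable names are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def linear_classifier_error(w, b, mu1, Sigma1, mu2, Sigma2, p1=0.5):
    """Error of the linear rule sign(w.x + b) for two Gaussian classes.

    Projecting class N(mu_i, Sigma_i) onto w gives the univariate normal
    N(w.mu_i + b, w.Sigma_i.w), so each class-conditional error is a
    single normal CDF evaluation -- a 1D computation independent of the
    original dimensionality, as in the analysis described above.
    """
    m1 = w @ mu1 + b              # projected mean of class 1
    m2 = w @ mu2 + b              # projected mean of class 2
    s1 = np.sqrt(w @ Sigma1 @ w)  # projected std dev of class 1
    s2 = np.sqrt(w @ Sigma2 @ w)  # projected std dev of class 2
    # Class 1 is assigned the half-space w.x + b > 0, so it errs when its
    # projection falls below 0; class 2 errs when its projection exceeds 0.
    err1 = norm.cdf(0.0, loc=m1, scale=s1)  # P(w.x + b < 0 | class 1)
    err2 = norm.sf(0.0, loc=m2, scale=s2)   # P(w.x + b > 0 | class 2)
    return p1 * err1 + (1.0 - p1) * err2

# Example: Fisher's direction for two 5-dimensional Gaussians with a
# shared covariance, thresholded at the midpoint of the projected means.
d = 5
mu1, mu2 = np.zeros(d), np.ones(d)
Sigma = np.eye(d)
w = np.linalg.solve(Sigma + Sigma, mu1 - mu2)  # Fisher: S_w^{-1}(mu1 - mu2)
b = -w @ (mu1 + mu2) / 2
print(linear_classifier_error(w, b, mu1, Sigma, mu2, Sigma))
```

In the equal-covariance case shown, this reproduces the Bayes error Φ(−Δ/2), where Δ is the Mahalanobis distance between the class means; for unequal covariances the rule is no longer optimal, but its error is still exactly computable in one dimension the same way.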