On Dimensionality, Sample Size, and Classification Error of Nonparametric Linear Classification Algorithms

Authors:
Šarūnas Raudys
Affiliations:
Institute of Mathematics and Informatics, Vilnius, Lithuania
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
1997

Citing 1
Cited 22

Support-Vector Networks

Machine Learning

On Optimal Pairwise Linear Classifiers for Normal Distributions: The Two-Dimensional Case

IEEE Transactions on Pattern Analysis and Machine Intelligence
Randomized Algorithms: A System-Level, Poly-Time Analysis of Robust Computation

IEEE Transactions on Computers
An approach to the evaluation of the performance of a discrete classifier

Pattern Recognition Letters
The Foundational Theory of Optimal Bayesian Pairwise Linear Classifiers

Proceedings of the Joint IAPR International Workshops on Advances in Pattern Recognition
Complexity of Classification Problems and Comparative Advantages of Combined Classifiers

MCS '00 Proceedings of the First International Workshop on Multiple Classifier Systems
The d-Dimensional Normal Distribution Case

AI '01 Proceedings of the 14th Australian Joint Conference on Artificial Intelligence: Advances in Artificial Intelligence
The influence of prior knowledge on the expected performance of a classifier

Pattern Recognition Letters
Selecting the best hyperplane in the framework of optimal pairwise linear classifiers

Pattern Recognition Letters
Results in statistical discriminant analysis: a review of the former Soviet union literature

Journal of Multivariate Analysis
A tree-based decision rule for identifying profile groups of cases without predefined classes: application in diffuse large B-cell lymphomas

Computers in Biology and Medicine
On the Bayes fusion of visual features

Image and Vision Computing
2008 Special Issue: Training neural network classifiers for medical decision making: The effects of imbalanced datasets on classification performance

Neural Networks
Classification tree based protein structure distances for testing sequence-structure correlation

Computers in Biology and Medicine
eigenPulse: Robust human identification from cardiovascular function

Pattern Recognition
Evaluating classifiers: relation between area under the receiver operator characteristic curve and overall accuracy

IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
Forest classification trees and forest support vector machines algorithms: Demonstration using microarray data

Computers in Biology and Medicine
Invariant operators, small samples, and the bias-variance dilemma

CVPR'04 Proceedings of the 2004 IEEE computer society conference on Computer vision and pattern recognition
Virtual sample generation using concurrent-self-organizing maps and its application for facial expression recognition

MMES'10 Proceedings of the 2010 international conference on Mathematical models for engineering science
A theoretical comparison of two linear dimensionality reduction techniques

CIARP'06 Proceedings of the 11th Iberoamerican conference on Progress in Pattern Recognition, Image Analysis and Applications
Alternative approaches and algorithms for classification

ICIAR'06 Proceedings of the Third international conference on Image Analysis and Recognition - Volume Part II
On the performance of chernoff-distance-based linear dimensionality reduction techniques

AI'06 Proceedings of the 19th international conference on Advances in Artificial Intelligence: Canadian Society for Computational Studies of Intelligence
Learning algorithms may perform worse with increasing training set size: Algorithm-data incompatibility

Computational Statistics & Data Analysis

Quantified Score

Hi-index	0.15


Abstract

This paper compares two nonparametric linear classification algorithms, the zero empirical error classifier and the maximum margin classifier, with parametric linear classifiers designed to classify multivariate Gaussian populations [7]. Formulae and a table for the mean expected probability of misclassification MEP_N are presented. They show that the classification error is mainly determined by N/p, a learning-set size/dimensionality ratio. However, the influences of learning-set size on the generalization error of parametric and nonparametric linear classifiers are quite different. Under certain conditions the nonparametric approach allows us to obtain reliable rules, even in cases where the number of features is larger than the number of training vectors.
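
The abstract's last claim, that a linear rule can be trained without error when the number of features p exceeds the number of training vectors, can be illustrated with a small simulation. The sketch below is not the paper's algorithm: it uses a minimum-norm least-squares fit, which is only one of many rules with zero empirical error, and the function name `zero_error_demo`, the spherical Gaussian classes, the mean shift `delta`, and all default sizes are assumptions chosen for illustration.

```python
import numpy as np

def zero_error_demo(p=50, n_per_class=10, delta=4.0, n_test=2000, seed=0):
    """Fit a linear rule with zero empirical error when p exceeds the training-set size."""
    rng = np.random.default_rng(seed)
    # Assumed setup: two spherical Gaussian classes whose means differ
    # by `delta` along the first feature only.
    mean = np.zeros(p)
    mean[0] = delta / 2.0

    def sample(n):
        x_pos = rng.standard_normal((n, p)) + mean
        x_neg = rng.standard_normal((n, p)) - mean
        x = np.vstack([x_pos, x_neg])
        y = np.concatenate([np.ones(n), -np.ones(n)])
        return x, y

    x_train, y_train = sample(n_per_class)
    x_test, y_test = sample(n_test)

    # Append a constant column so the bias is part of the weight vector.
    a_train = np.hstack([x_train, np.ones((len(x_train), 1))])
    a_test = np.hstack([x_test, np.ones((len(x_test), 1))])

    # Minimum-norm least-squares solution. With more unknowns (p + 1) than
    # training vectors, it reproduces the +/-1 labels, so no training
    # vector is misclassified.
    w, *_ = np.linalg.lstsq(a_train, y_train, rcond=None)

    train_err = float(np.mean(np.sign(a_train @ w) != y_train))
    test_err = float(np.mean(np.sign(a_test @ w) != y_test))
    return train_err, test_err

if __name__ == "__main__":
    train_err, test_err = zero_error_demo()
    print(f"training error: {train_err:.3f}, test error: {test_err:.3f}")
```

Varying `n_per_class` while holding `p` fixed gives a rough feel for how the test error of this particular rule changes with the N/p ratio. It does not reproduce the paper's formulae or table.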