Prototype Selection for Nearest Neighbor Classification: Taxonomy and Empirical Study

Authors:
Salvador Garcia; Joaquin Derrac; Jose Cano; Francisco Herrera
Affiliations:
University of Jaen, Jaen; CITIC-UGR (Research Center on Information and Communications Technology), Granada; University of Jaen, Jaen; CITIC-UGR (Research Center on Information and Communications Technology), Granada
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
2012

Abstract

The nearest neighbor classifier is one of the most widely used and best-known techniques for recognition tasks. Despite its simplicity, it has also proven to be one of the most useful algorithms in data mining. However, it suffers from several drawbacks, such as high storage requirements, slow classification response, and low noise tolerance. These weaknesses have been studied by many researchers, and many solutions have been proposed. One of the most promising consists of reducing the data used to establish the classification rule (the training data) by selecting relevant prototypes. Many prototype selection methods exist in the literature, and research in this area is still advancing. These methods differ in their defining properties, but no formal categorization has been established yet. This paper surveys the prototype selection methods proposed in the literature from both a theoretical and an empirical point of view. From the theoretical point of view, we propose a taxonomy based on the main characteristics of prototype selection methods and analyze their advantages and drawbacks. Empirically, we conduct an experimental study on data sets of different sizes to measure the methods' performance in terms of accuracy, reduction capability, and runtime. The results obtained by all the methods studied have been verified with nonparametric statistical tests. Several remarks, guidelines, and recommendations are given for the use of prototype selection in nearest neighbor classification.
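
To make the idea of selecting prototypes concrete, the following is a minimal sketch of Hart's condensed nearest neighbor rule, a classic prototype selection method. It is chosen here purely for illustration and is not presented as the taxonomy or the experimental setup of the paper. The function names and the toy data are assumptions made for this example.

```python
from typing import List, Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_label(query: Point, prototypes: List[Point], labels: List[int]) -> int:
    """1-NN rule: return the label of the prototype closest to the query."""
    best = min(range(len(prototypes)),
               key=lambda i: squared_distance(query, prototypes[i]))
    return labels[best]


def condensed_nn(points: List[Point], labels: List[int]) -> List[int]:
    """Hart's condensed NN: return indices of the retained prototypes.

    Starts from the first instance and adds every instance that the
    current subset misclassifies, repeating until a full pass adds nothing.
    """
    selected = [0]
    changed = True
    while changed:
        changed = False
        for i in range(len(points)):
            if i in selected:
                continue
            protos = [points[j] for j in selected]
            proto_labels = [labels[j] for j in selected]
            if nearest_label(points[i], protos, proto_labels) != labels[i]:
                selected.append(i)  # misclassified, so keep it as a prototype
                changed = True
    return selected


# Toy data (hypothetical): two well-separated classes in the plane.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

selected = condensed_nn(points, labels)
reduction = 1 - len(selected) / len(points)
print("retained indices:", selected)
print("reduction rate:", reduction)
```

The retained subset classifies every training instance correctly under the 1-NN rule while storing fewer instances, which reflects the trade-off between accuracy and reduction that the abstract describes. The result of this particular rule depends on the order in which instances are presented.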