Analysis of new techniques to obtain quality training sets

Authors:
J. S. Sánchez; R. Barandela; A. I. Marqués; R. Alejo; J. Badenas
Affiliations:
Universitat Jaume I, Av. Vicent Sos Baynat s/n, 12006 Castellón, Spain; Instituto Tecnológico de Toluca, Av. Tecnológico s/n, 52140 Metepec, Mexico; Universitat Jaume I, Av. Vicent Sos Baynat s/n, 12006 Castellón, Spain; Instituto Tecnológico de Toluca, Av. Tecnológico s/n, 52140 Metepec, Mexico; Universitat Jaume I, Av. Vicent Sos Baynat s/n, 12006 Castellón, Spain
Venue:
Pattern Recognition Letters - Special issue: Sibgrapi 2001
Year:
2003

Abstract

This paper presents new algorithms to identify and eliminate mislabelled, noisy and atypical training samples for supervised learning and, more specifically, for nearest neighbour classification. The main goal of these approaches is to enhance classification accuracy by improving the quality of the training data. Several experiments with synthetic and real data sets are carried out to illustrate the behaviour of the proposed schemes and to compare their performance with that of other traditional techniques. The ability of these new algorithms to "reduce" the possible overlap among regions of different classes is also analysed.
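
To make the idea of training set editing concrete, the following is a minimal sketch of Wilson-style editing, a classical technique of the kind the abstract calls "traditional". It is not one of the new algorithms presented in the paper. The function name `wilson_edit`, the parameter `k`, and the toy data are assumptions chosen for illustration only. Each sample is classified by a majority vote of its k nearest neighbours among the remaining samples, and samples whose label disagrees with that vote are discarded.

```python
import math
from collections import Counter


def wilson_edit(points, labels, k=3):
    """Return indices of samples kept by a Wilson-style editing pass.

    Illustrative sketch only: a sample is kept when its label matches
    the majority label of its k nearest neighbours (leave-one-out,
    Euclidean distance). Ties in the vote are resolved by Counter order.
    """
    keep = []
    for i, p in enumerate(points):
        # Distances from sample i to every other sample.
        neighbours = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        votes = Counter(labels[j] for _, j in neighbours[:k])
        majority_label = votes.most_common(1)[0][0]
        if majority_label == labels[i]:
            keep.append(i)
    return keep


# Hypothetical toy data: two well-separated clusters, with the sample at
# index 3 carrying label "b" although it lies inside the "a" cluster.
points = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
    (5.0, 5.0), (5.1, 5.2), (5.2, 5.1),
]
labels = ["a", "a", "a", "b", "b", "b", "b"]

kept = wilson_edit(points, labels, k=3)
print(kept)  # index 3 is dropped as a likely mislabelled sample
```

On this toy set the pass removes only the sample that sits in the wrong cluster, which is the kind of overlap reduction between class regions that editing aims for. A nearest neighbour classifier would then be built on the retained samples.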