Decontamination of Training Samples for Supervised Pattern Recognition Methods

Authors:
Ricardo Barandela; Eduardo Gasca
Venue:
Proceedings of the Joint IAPR International Workshops on Advances in Pattern Recognition
Year:
2000

Abstract

The present work discusses what have been called 'imperfectly supervised situations': pattern recognition applications where the assumption of label correctness does not hold for all the elements of the training sample. A methodology is presented for contending with these practical situations and avoiding their negative impact on the performance of supervised methods. This methodology can be regarded as a cleaning process that removes some suspicious instances from the training sample or corrects the class labels of others while retaining them. It was conceived for classification with the Nearest Neighbor rule, a supervised nonparametric classifier that combines conceptual simplicity with an asymptotic error rate bounded in terms of the optimal Bayes error. However, initial experiments on the learning phase of a Multilayer Perceptron (not reported in the present work) seem to indicate a broader applicability. Results with both simulated and real data sets are presented to support the methodology and to clarify the ideas behind it. Related works are briefly reviewed, and some issues deserving further research are also outlined.
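
The cleaning idea described in the abstract (remove some suspicious training instances, relabel others) can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything below is an assumption made for illustration: the function name `decontaminate`, the parameters `k` and `k_min`, the Euclidean metric, and the rule itself (a leave-one-out k-nearest-neighbour vote in which a sample takes the majority label when at least `k_min` neighbours agree and is discarded otherwise).

```python
import numpy as np


def decontaminate(X, y, k=3, k_min=2):
    """Leave-one-out k-NN cleaning: relabel or discard suspicious samples.

    Hypothetical helper for illustration only; it is not claimed to be the
    authors' exact procedure. Returns (X_clean, y_clean, kept_indices).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # Pairwise squared Euclidean distances (assumed metric).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    # A sample must never vote for itself.
    np.fill_diagonal(dist, np.inf)

    new_labels = y.copy()
    keep = np.zeros(len(y), dtype=bool)
    for i in range(len(y)):
        neighbours = np.argsort(dist[i], kind="stable")[:k]
        # Votes always use the ORIGINAL labels, not labels corrected so far.
        classes, votes = np.unique(y[neighbours], return_counts=True)
        best = np.argmax(votes)
        if votes[best] >= k_min:
            # Enough neighbours agree: keep the sample under the majority
            # label, which corrects it when its given label differs.
            new_labels[i] = classes[best]
            keep[i] = True
        # Otherwise the neighbourhood is ambiguous and the sample is dropped.
    return X[keep], new_labels[keep], np.flatnonzero(keep)
```

Calling `decontaminate(X, y, k=3, k_min=2)` on a labelled sample returns the retained points, their possibly corrected labels, and their indices in the original sample. Raising `k_min` towards `k` makes the filter stricter: more samples are discarded instead of being relabelled. The cleaned set would then serve as the reference set for a Nearest Neighbor classifier.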