Unsupervised data pruning for clustering of noisy data. Knowledge-Based Systems.
Avoiding Boosting Overfitting by Removing Confusing Samples. ECML '07: Proceedings of the 18th European Conference on Machine Learning.
Active Learning Using a Constructive Neural Network Algorithm. ICANN '08: Proceedings of the 18th International Conference on Artificial Neural Networks, Part II.
Improving object detection by removing noisy samples from training sets. MIR '08: Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval.
Computers in Biology and Medicine.
Learning assignment order of instances for the constrained K-means clustering algorithm. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics.
A team of continuous-action learning automata for noise-tolerant learning of half-spaces. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics.
Edited AdaBoost by weighted kNN. Neurocomputing.
RANSAC-based training data selection for emotion recognition from spontaneous speech. Proceedings of the 3rd International Workshop on Affective Interaction in Natural Environments.
Face verification using indirect neighbourhood components analysis. ISVC '10: Proceedings of the 6th International Conference on Advances in Visual Computing, Part II.
Learning Multi-modal Similarity. The Journal of Machine Learning Research.
Unsupervised video surveillance. ACCV '10: Proceedings of the 2010 International Conference on Computer Vision, Part I.
COST '10: Proceedings of the 2010 International Conference on Analysis of Verbal and Nonverbal Communication and Enactment.
International Journal of Multimedia Data Engineering & Management.
The C-loss function for pattern classification. Pattern Recognition.
Training datasets for learning of object categories are often contaminated or imperfect. We explore an approach that automatically identifies examples that are noisy or troublesome for learning and excludes them from the training set. The problem is relevant to learning in semi-supervised or unsupervised settings, as well as to learning when the training data is contaminated with wrongly labeled examples or contains correctly labeled but hard-to-learn examples. We propose a fully automatic mechanism for noise cleaning, called "data pruning", and demonstrate its success on learning of human faces. It is not assumed that the data or the noise can be modeled or that additional training examples are available. Our experiments show that data pruning can improve generalization performance for algorithms with varying robustness to noise. It outperforms methods with regularization properties and is superior to commonly applied aggregation methods such as bagging.
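To make the idea concrete, here is a minimal sketch of pruning by consensus, not the paper's actual mechanism: train an ensemble of weak learners (decision stumps on 1-D data, an assumption chosen for brevity) on bootstrap samples, then discard any training point that is misclassified in most of its out-of-bag evaluations. The function names (`stump_fit`, `prune`) and the out-of-bag error threshold are illustrative choices, not from the source.

```python
import random

def stump_fit(xs, ys):
    # Pick the threshold/polarity pair minimizing training error on 1-D data.
    best = (None, 1, 1.0)  # (threshold, polarity, error)
    for t in sorted(set(xs)):
        for pol in (1, -1):
            preds = [pol if x >= t else -pol for x in xs]
            err = sum(p != y for p, y in zip(preds, ys)) / len(ys)
            if err < best[2]:
                best = (t, pol, err)
    return best[0], best[1]

def stump_predict(model, x):
    t, pol = model
    return pol if x >= t else -pol

def prune(xs, ys, n_models=50, max_oob_err=0.5, seed=0):
    """Return indices of training points to keep.

    A point is pruned when the ensemble, trained on bootstrap samples
    that exclude it, misclassifies it more than max_oob_err of the time
    (a hypothetical pruning criterion for illustration).
    """
    rng = random.Random(seed)
    n = len(xs)
    wrong, seen = [0] * n, [0] * n
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        model = stump_fit([xs[i] for i in idx], [ys[i] for i in idx])
        for i in set(range(n)) - set(idx):  # out-of-bag points
            seen[i] += 1
            wrong[i] += stump_predict(model, xs[i]) != ys[i]
    return [i for i in range(n)
            if seen[i] == 0 or wrong[i] / seen[i] <= max_oob_err]
```

For example, given negatives clustered near 0, positives near 1, and one point mislabeled as positive at x = 0.3, `prune` drops the mislabeled point while keeping the clean ones; the learner is then retrained on the kept indices only.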