Active learning with irrelevant examples

  • Authors:
  • Dominic Mazzoni; Kiri L. Wagstaff; Michael C. Burl

  • Affiliations:
  • Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA (all authors)

  • Venue:
  • ECML'06: Proceedings of the 17th European Conference on Machine Learning
  • Year:
  • 2006


Abstract

Active learning algorithms attempt to accelerate the learning process by requesting labels for the most informative items first. In real-world problems, however, there may exist unlabeled items that are irrelevant to the user's classification goals. Queries about these points slow down learning because they provide no information about the problem of interest. We have observed that when irrelevant items are present, active learning can perform worse than random selection, requiring more time (queries) to achieve the same level of accuracy. Therefore, we propose a novel approach, Relevance Bias, in which the active learner combines its default selection heuristic with the output of a simultaneously trained relevance classifier to favor items that are likely to be both informative and relevant. In our experiments on a real-world problem and two benchmark datasets, the Relevance Bias approach significantly improves the learning rate of three different active learning approaches.
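To make the idea concrete, below is a minimal sketch of how an active learner might combine an informativeness heuristic with a simultaneously trained relevance classifier, as the abstract describes. The function names (`uncertainty`, `relevance_bias_query`), the use of uncertainty sampling, the multiplicative combination of the two scores, and the synthetic data are illustrative assumptions, not the authors' exact formulation or experimental setup.

```python
# Sketch: bias query selection toward items that are both informative and
# likely relevant. Assumptions: uncertainty sampling as the base heuristic,
# a multiplicative combination of scores, scikit-learn logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression


def uncertainty(model, X):
    """Informativeness score: highest near the decision boundary (p = 0.5)."""
    proba = model.predict_proba(X)
    return 1.0 - 2.0 * np.abs(proba[:, 1] - 0.5)


def relevance_bias_query(task_model, relevance_model, X_pool):
    """Select the unlabeled item maximizing informativeness * P(relevant)."""
    info = uncertainty(task_model, X_pool)
    p_relevant = relevance_model.predict_proba(X_pool)[:, 1]
    return int(np.argmax(info * p_relevant))


# Illustrative usage with synthetic data.
rng = np.random.default_rng(0)
X_pool = rng.normal(size=(200, 5))
y_task = (X_pool[:, 0] > 0).astype(int)      # hypothetical class labels
y_relevant = (X_pool[:, 1] > 0).astype(int)  # hypothetical relevance flags

# Seed both classifiers with a few labeled examples, then pick the next query.
seed = list(range(20))
task_model = LogisticRegression().fit(X_pool[seed], y_task[seed])
relevance_model = LogisticRegression().fit(X_pool[seed], y_relevant[seed])
print("next query index:", relevance_bias_query(task_model, relevance_model, X_pool))
```

In this sketch, the relevance classifier is retrained as the user marks queried items relevant or irrelevant, so the bias term sharpens over time and queries about irrelevant items become increasingly unlikely.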