Active learning using on-line algorithms

Authors:
Chris Mesterharm;Michael J. Pazzani
Affiliations:
Rutgers, The State University of New Jersey, Piscataway, NJ, USA;Rutgers, The State University of New Jersey, Piscataway, NJ, USA
Venue:
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2011

Abstract

This paper describes a new technique and analysis for using on-line learning algorithms to solve active learning problems. Our algorithm, Active Vote, works by actively selecting instances that force several perturbed copies of an on-line algorithm to make mistakes. The main intuition for our result is that the number of mistakes made by the optimal on-line algorithm is a lower bound on the number of labels needed for active learning. We provide performance bounds for Active Vote in both a batch and an on-line model of active learning. These bounds depend on the algorithm having a set of unlabeled instances on which the various perturbed on-line algorithms disagree. The motivating application for Active Vote is an Internet advertisement rating program. We conduct experiments using data collected for this advertisement problem, along with experiments using standard datasets. We show that Active Vote can achieve an order-of-magnitude decrease in the number of labeled instances compared with various passive learning algorithms such as Support Vector Machines.
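
The selection idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's Active Vote implementation. It assumes the Perceptron as the base on-line algorithm, random initial weights as the perturbation, and a "first disagreement found" selection rule. All names (`active_committee`, `vote`, `pool`, `oracle`) are hypothetical. The point it shows is that a label is requested only for an instance on which the copies disagree, so every request forces at least one copy to make a mistake and update.

```python
import random


def predict(w, x):
    """Linear-threshold prediction in {-1, +1}; x carries a trailing bias feature."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


def active_committee(pool, oracle, n_copies=5, max_rounds=10_000, seed=0):
    """Request labels only where perturbed Perceptron copies disagree.

    Returns (committee, labels); labels maps pool index -> requested label.
    """
    rng = random.Random(seed)
    dim = len(pool[0])
    # Perturbation (an assumption of this sketch): random initial weights.
    committee = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_copies)]
    labels = {}
    for _ in range(max_rounds):
        # Look for an instance on which the copies do not all agree.
        target = None
        for i, x in enumerate(pool):
            if len({predict(w, x) for w in committee}) > 1:
                target = i
                break
        if target is None:
            break  # no disagreement left in the pool
        if target not in labels:
            labels[target] = oracle(pool[target])  # one label request
        x, y = pool[target], labels[target]
        # Mistake-driven Perceptron update: only the copies that erred change.
        for w in committee:
            if predict(w, x) != y:
                for j in range(dim):
                    w[j] += y * x[j]
    return committee, labels


def vote(committee, x):
    """Majority vote of the copies (odd committee sizes avoid ties)."""
    return 1 if sum(predict(w, x) for w in committee) >= 0 else -1


def oracle(x):
    """Hidden separator for the toy data: label is the sign of x[0] + x[1]."""
    return 1 if x[0] + x[1] > 0 else -1


# Toy pool (illustrative data, not from the paper): grid points with a bias
# feature, excluding points that lie exactly on the hidden boundary.
pool = [(a, b, 1.0) for a in range(-5, 6) for b in range(-5, 6) if a + b != 0]
committee, labels = active_committee(pool, oracle)
print(len(labels), "labels requested out of", len(pool), "pool instances")
```

Because the toy data is linearly separable with a margin, each Perceptron copy can make only a bounded number of mistakes, so the loop stops once the copies agree on every pool instance. Agreement alone does not guarantee that the voted labels are correct on instances that were never queried.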