Improving on-line learning

  • Authors:
  • Haym Hirsh; Jon Christian Mesterharm

  • Affiliations:
  • Rutgers, The State University of New Jersey - New Brunswick; Rutgers, The State University of New Jersey - New Brunswick

  • Venue:
  • Improving on-line learning
  • Year:
  • 2007


Abstract

In this dissertation, we consider techniques to improve the performance and applicability of algorithms used for on-line learning. We organize these techniques according to the assumptions they make about how instances are generated. Our first assumption is that the instances are generated by a fixed distribution. Many algorithms are designed to perform well when instances are generated by an adversary; we give two techniques to modify these algorithms to improve performance when the instances are instead generated by a distribution. We validate these techniques with extensive experiments using a wide range of real-world data sets. Our second assumption is that the target concept the algorithm is attempting to learn changes over time. We give a modification of the Winnow algorithm and show it has good bounds for tracking a shifting concept when instances are generated by an adversary. We also consider the case where the instances are generated by a shifting distribution. We apply variations of the previous fixed-distribution techniques and show, through experiments derived from real data, that these techniques continue to significantly improve performance. Finally, we assume that the labels for instances may be delayed for a number of trials. We give techniques to modify an on-line algorithm so that it performs well even when labels are delayed. We derive upper bounds on the performance of these modifications and show through lower bounds that they are close to optimal.
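For context on the algorithm the abstract modifies, below is a minimal sketch of the classic Winnow algorithm (Littlestone's multiplicative-update on-line learner for Boolean attributes), not the dissertation's shifting-concept variant. The parameter names (`alpha`, `theta`) and the demo target concept are illustrative choices, not taken from the dissertation.

```python
def winnow_predict(weights, x, theta):
    """Predict 1 if the weighted vote of the active attributes clears theta."""
    return 1 if sum(w for w, xi in zip(weights, x) if xi) >= theta else 0

def winnow_update(weights, x, y, y_hat, alpha=2.0):
    """Multiplicative update: promote active weights on a false negative,
    demote them on a false positive; leave weights alone when correct."""
    if y_hat == y:
        return weights
    factor = alpha if y == 1 else 1.0 / alpha
    return [w * factor if xi else w for w, xi in zip(weights, x)]

def winnow_train(examples, n, alpha=2.0, theta=None):
    """One on-line pass over (x, y) trials; returns final weights and
    the number of prediction mistakes made along the way."""
    theta = n / 2 if theta is None else theta
    weights = [1.0] * n
    mistakes = 0
    for x, y in examples:
        y_hat = winnow_predict(weights, x, theta)
        if y_hat != y:
            mistakes += 1
            weights = winnow_update(weights, x, y, y_hat, alpha)
    return weights, mistakes

# Demo: the target is the disjunction x1 OR x2 over 4 Boolean attributes.
examples = [([1, 0, 0, 0], 1), ([0, 1, 0, 0], 1),
            ([0, 0, 1, 0], 0), ([0, 0, 0, 1], 0)] * 5
weights, mistakes = winnow_train(examples, n=4)
```

On this small disjunction the learner makes only a handful of mistakes before its weights separate the relevant attributes from the irrelevant ones; the mistake bound growing only logarithmically in the number of attributes is what makes Winnow attractive for the tracking and delayed-label settings the abstract describes.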