Mistake bounds and logarithmic linear-threshold learning algorithms
The weighted majority algorithm. Information and Computation.
The binary exponentiated gradient algorithm for learning linear functions. COLT '97: Proceedings of the Tenth Annual Conference on Computational Learning Theory.
Machine Learning - Special issue on context sensitivity and concept drift
Linear hinge loss and average margin. Proceedings of the 1998 Conference on Advances in Neural Information Processing Systems II.
Predicting nearly as well as the best pruning of a planar decision graph. ALT '99: Proceedings of the 10th International Conference on Algorithmic Learning Theory.
It is easy to design on-line learning algorithms for learning k-out-of-n-variable monotone disjunctions by simply keeping one weight per disjunction. Such algorithms use roughly O(n^k) weights, which can be prohibitively expensive. Surprisingly, algorithms like Winnow require only n weights (one per variable), and their mistake bounds are not much worse than those of the more costly algorithms. The purpose of this paper is to investigate how the exponentially many weights can be collapsed into only O(n) weights. In particular, we consider probabilistic assumptions under which the Bayes optimal algorithm's posterior over the disjunctions can be encoded with only O(n) weights. This yields a new O(n) algorithm for learning disjunctions which is related to Bylander's BEG algorithm, originally introduced for linear regression. Besides providing a Bayesian interpretation for this new algorithm, we also obtain mistake bounds for the noise-free case resembling those derived for the Winnow algorithm. The same techniques used to derive the new algorithm also provide a Bayesian interpretation for a normalized version of Winnow.
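The abstract contrasts the naive approach (one weight per disjunction, roughly O(n^k) weights) with algorithms like Winnow that keep only n weights. As a point of reference, the sketch below shows the classic Winnow update for monotone disjunctions with one weight per variable; it is not the paper's new BEG-related algorithm, and the function name, interface, and constants (threshold n/2, update factor alpha = 2) are illustrative assumptions.

```python
import numpy as np

def winnow_learn(examples, n, threshold=None, alpha=2.0):
    """Minimal sketch of Winnow for learning a monotone disjunction over n
    Boolean variables, using one weight per variable (O(n) weights).
    `examples` yields (x, y) pairs with x a 0/1 vector of length n and
    y in {0, 1}.  Names and constants here are illustrative assumptions."""
    if threshold is None:
        threshold = n / 2.0          # a standard choice of threshold
    w = np.ones(n)                   # start with all weights equal to 1
    mistakes = 0
    for x, y in examples:
        x = np.asarray(x, dtype=float)
        y_hat = 1 if w @ x >= threshold else 0
        if y_hat != y:
            mistakes += 1
            if y == 1:               # false negative: promote active weights
                w[x > 0] *= alpha
            else:                    # false positive: demote active weights
                w[x > 0] /= alpha
    return w, mistakes

# Example usage: target disjunction x_0 OR x_3 over n = 10 variables.
rng = np.random.default_rng(0)
n = 10
data = [(x, int(x[0] or x[3])) for x in rng.integers(0, 2, size=(200, n))]
w, m = winnow_learn(data, n)
print(m)  # mistake count; Winnow makes O(k log n) mistakes in the noise-free case
```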