Path kernels and multiplicative updates

Authors:
Eiji Takimoto; Manfred K. Warmuth
Affiliations:
Graduate School of Information Sciences, Tohoku University, Sendai, 980-8579, Japan; Computer Science Department, University of California, Santa Cruz, CA
Venue:
The Journal of Machine Learning Research
Year:
2003

Abstract

Kernels are typically applied to linear algorithms whose weight vector is a linear combination of the feature vectors of the examples. On-line versions of these algorithms are sometimes called "additive updates" because they add a multiple of the last feature vector to the current weight vector.

In this paper we show how to use special convolution kernels to efficiently implement "multiplicative" updates. The kernels are defined by a directed graph. Each edge contributes an input. The inputs along a path form a product feature, and all such products together make up the feature vector associated with the inputs.

We also place a set of probabilities on the edges so that the total outflow from each vertex is one. We then discuss multiplicative updates on these graphs, where the prediction is essentially a kernel computation and the update contributes a factor to each edge. After the factors are applied to the edges, the total outflow from each vertex is no longer one. However, efficient algorithms re-normalize the weights on the paths so that the total outflow from each vertex is one again. Finally, we show that if the digraph is built from a regular expression, this structure can be used to speed up the kernel and re-normalization computations.

We reformulate a large number of multiplicative update algorithms using path kernels and characterize the applicability of our method. The examples include efficient algorithms for learning disjunctions and a recent algorithm that predicts as well as the best pruning of a series-parallel digraph.
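To make the graph construction in the abstract concrete, the following is a minimal Python sketch, not the paper's own pseudocode. It assumes an acyclic digraph whose vertices are numbered in topological order, with vertex 0 as the source and vertex n-1 as the sink, positive edge values, and every vertex able to reach the sink. The function names `backward_sums`, `predict`, and `renormalize` are hypothetical. The prediction sums, over all source-to-sink paths, the path weight times the product of the inputs on the path, computed by dynamic programming instead of path enumeration. The re-normalization step rescales each edge by a ratio of backward sums (a technique commonly called weight pushing); that this matches the paper's procedure in every detail is an assumption.

```python
def backward_sums(n, edge_vals):
    """For each vertex u, sum over all paths from u to the sink (vertex n-1)
    of the product of the edge values along the path."""
    out = {u: [] for u in range(n)}
    for (u, v), val in edge_vals.items():
        assert u < v, "vertices must be numbered in topological order"
        out[u].append((v, val))
    back = [0.0] * n
    back[n - 1] = 1.0  # the empty path at the sink has product 1
    for u in range(n - 2, -1, -1):
        back[u] = sum(val * back[v] for v, val in out[u])
    return back


def predict(n, weights, inputs):
    """Sum over source-to-sink paths of (product of edge weights on the path)
    times (product of edge inputs on the path)."""
    scaled = {e: weights[e] * inputs[e] for e in weights}
    return backward_sums(n, scaled)[0]


def renormalize(n, weights, factors):
    """Multiply each edge weight by its update factor, then rescale the edges
    so the outflow of every non-sink vertex is one again. Each path weight
    ends up proportional to its old weight times the factors along it."""
    scaled = {e: weights[e] * factors[e] for e in weights}
    back = backward_sums(n, scaled)
    # Assumes back[u] > 0, i.e. positive values and every vertex reaches the sink.
    return {(u, v): val * back[v] / back[u] for (u, v), val in scaled.items()}


# Illustrative 4-vertex digraph with three source-to-sink paths.
weights = {(0, 1): 0.5, (0, 2): 0.5, (1, 2): 0.5, (1, 3): 0.5, (2, 3): 1.0}
inputs = {(0, 1): 2.0, (0, 2): 3.0, (1, 2): 4.0, (1, 3): 1.0, (2, 3): 0.5}
factors = {(0, 1): 0.5, (0, 2): 1.0, (1, 2): 1.0, (1, 3): 2.0, (2, 3): 1.0}

print(predict(4, weights, inputs))  # 2.25
new_weights = renormalize(4, weights, factors)
print(new_weights)
```

Both routines visit each edge a constant number of times, so their cost is linear in the number of edges even when the number of paths is exponential in the size of the graph.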