Virtual vector machine for Bayesian online classification

Authors:
Thomas P. Minka;Rongjing Xiang;Yuan (Alan) Qi
Affiliations:
Microsoft Research, Cambridge, UK;Purdue University, West Lafayette, IN;Purdue University, West Lafayette, IN
Venue:
UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
Year:
2009

Abstract

In a typical online learning scenario, a learner is required to process a large data stream using a small memory buffer. Such a requirement is usually in conflict with a learner's primary pursuit of prediction accuracy. To address this dilemma, we introduce a novel Bayesian online classification algorithm, called the Virtual Vector Machine. The virtual vector machine allows a smooth trade-off between prediction accuracy and memory size. It summarizes the information contained in the preceding data stream by a Gaussian distribution over the classification weights plus a constant number of virtual data points. The virtual data points are designed to add extra non-Gaussian information about the classification weights. To keep the number of virtual points constant, the virtual vector machine adds the current real data point to the virtual point set and then either merges the two most similar virtual points into a new virtual point or deletes a virtual point that is far from the decision boundary. The information lost in this process is absorbed into the Gaussian distribution. The extra information provided by the virtual points leads to improved predictive accuracy over previous online classification algorithms.
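
The buffer maintenance step described in the abstract (add the new point, then merge or delete to stay at a fixed size) can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not say how similarity is measured, how the choice between merging and deleting is made, or how lost information is absorbed into the Gaussian. The class name `VirtualPointBuffer`, the Euclidean distance, the same-label restriction, the midpoint merge, and the `merge_threshold` parameter are all assumptions made for illustration, and the Gaussian update is replaced by a plain counter.

```python
import math
from itertools import combinations


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


class VirtualPointBuffer:
    """Toy fixed-size buffer of labelled points (x, y) with y in {-1, +1}."""

    def __init__(self, capacity, mean_weights, merge_threshold=1.0):
        self.capacity = capacity
        # Stand-in for the mean of the Gaussian over weights (assumed nonzero).
        self.mean_weights = list(mean_weights)
        # Hypothetical rule: merge only if the closest pair is this close.
        self.merge_threshold = merge_threshold
        self.points = []
        # Counts shrink steps; a real model would update the Gaussian here.
        self.absorbed = 0

    def margin(self, x):
        # Distance from x to the decision boundary defined by mean_weights.
        norm = math.sqrt(dot(self.mean_weights, self.mean_weights))
        return abs(dot(self.mean_weights, x)) / norm

    def add(self, x, y):
        self.points.append((list(x), y))
        if len(self.points) > self.capacity:
            self._shrink()

    def _closest_same_label_pair(self):
        best = None
        for i, j in combinations(range(len(self.points)), 2):
            (xi, yi), (xj, yj) = self.points[i], self.points[j]
            if yi != yj:
                continue
            d = math.dist(xi, xj)
            if best is None or d < best[0]:
                best = (d, i, j)
        return best

    def _shrink(self):
        pair = self._closest_same_label_pair()
        if pair is not None and pair[0] <= self.merge_threshold:
            # Merge the two most similar points into their midpoint.
            _, i, j = pair
            (xi, y), (xj, _) = self.points[i], self.points[j]
            merged = ([(a + b) / 2 for a, b in zip(xi, xj)], y)
            self.points = [p for k, p in enumerate(self.points) if k not in (i, j)]
            self.points.append(merged)
        else:
            # Otherwise delete the point farthest from the decision boundary.
            far = max(range(len(self.points)),
                      key=lambda k: self.margin(self.points[k][0]))
            del self.points[far]
        self.absorbed += 1


buf = VirtualPointBuffer(capacity=2, mean_weights=[1.0, 0.0], merge_threshold=0.5)
for x, y in [([1.0, 0.0], 1), ([1.2, 0.0], 1), ([-3.0, 0.0], -1), ([5.0, 0.0], 1)]:
    buf.add(x, y)
print(len(buf.points), buf.absorbed)
```

In this toy run the third point triggers a merge of the two nearby positive points, and the fourth triggers deletion of the point farthest from the boundary, so the buffer never holds more than `capacity` points after an update.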