Streamed learning: one-pass SVMs

Authors:
Piyush Rai;Hal Daumé;Suresh Venkatasubramanian
Affiliations:
University of Utah, School of Computing;University of Utah, School of Computing;University of Utah, School of Computing
Venue:
IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
Year:
2009

References (14)

Fast training of support vector machines using sequential minimal optimization. Advances in kernel methods.
An introduction to Support Vector Machines: and other kernel-based learning methods.
Clustering Data Streams: Theory and Practice. IEEE Transactions on Knowledge and Data Engineering.
Classifying large data sets using SVMs with hierarchical clusters. Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining.
Convex Optimization.
Approximating extent measures of points. Journal of the ACM (JACM).
Solving large scale linear prediction problems using stochastic gradient descent algorithms. ICML '04 Proceedings of the twenty-first international conference on Machine learning.
Core Vector Machines: Fast SVM Training on Very Large Data Sets. The Journal of Machine Learning Research.
Data streams: algorithms and applications. Foundations and Trends® in Theoretical Computer Science.
Fast Kernel Classifiers with Online and Active Learning. The Journal of Machine Learning Research.
Pegasos: Primal Estimated sub-GrAdient SOlver for SVM. Proceedings of the 24th international conference on Machine learning.
Simpler core vector machines with enclosing balls. Proceedings of the 24th international conference on Machine learning.
Confidence-weighted linear classification. Proceedings of the 25th international conference on Machine learning.
Maximum margin coresets for active and noise tolerant learning. IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence.

Cited by (4)

Large linear classification when data cannot fit in memory. Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining.
Two one-pass algorithms for data stream classification using approximate MEBs. ICANNGA'11 Proceedings of the 10th international conference on Adaptive and natural computing algorithms - Volume Part II.
Selective block minimization for faster convergence of limited memory large-scale linear models. Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining.
The shifting sands algorithm. Proceedings of the twenty-third annual ACM-SIAM symposium on Discrete Algorithms.

Abstract

We present a streaming model for large-scale classification (in the context of l2-SVM) by leveraging connections between learning and computational geometry. The streaming model imposes the constraint that only a single pass over the data is allowed. The l2-SVM is known to have an equivalent formulation in terms of the minimum enclosing ball (MEB) problem, and an efficient algorithm based on the idea of core sets exists (CVM) [Tsang et al., 2005]. CVM learns a (1+ε)-approximate MEB for a set of points and yields an approximate solution to the corresponding SVM instance. However, CVM works in batch mode, requiring multiple passes over the data. This paper presents a single-pass SVM based on the minimum enclosing ball of streaming data. We show that the MEB updates for the streaming case can be easily adapted to learn the SVM weight vector in a way similar to using online stochastic gradient updates. Our algorithm performs polylogarithmic computation at each example, and requires only a small, constant amount of storage. Experimental results show that, even in such restrictive settings, we can learn efficiently in just one pass and get accuracies comparable to other state-of-the-art SVM solvers (batch and online). We also give an analysis of the algorithm, and discuss some open issues and possible extensions.
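
To make the idea of a one-pass MEB update concrete, the following is a minimal sketch of a simple, well-known streaming rule: whenever a new point falls outside the current ball, the ball is replaced by the smallest ball that encloses both the old ball and the new point. This is an illustration only and is not taken from the paper; the function name `streaming_meb` and the plain Euclidean setting are assumptions, and the paper's actual algorithm operates on the SVM-derived point set and recovers a weight vector from the ball's center.

```python
import math

def streaming_meb(points):
    """One-pass approximate minimum enclosing ball (illustrative sketch).

    Keeps only a center and a radius, so storage is constant in the
    number of points seen. `streaming_meb` is a hypothetical name.
    """
    it = iter(points)
    center = list(next(it))  # the first point is a ball of radius 0
    radius = 0.0
    for p in it:
        dist = math.dist(center, p)
        if dist > radius:
            # p lies outside the ball: grow to the smallest ball that
            # contains both the old ball and p.
            shift = 0.5 * (dist - radius)   # distance the center moves toward p
            frac = shift / dist             # same move as a fraction of dist
            center = [c + frac * (x - c) for c, x in zip(center, p)]
            radius += shift
    return center, radius

center, radius = streaming_meb([(0.0, 0.0), (2.0, 0.0)])
print(center, radius)  # [1.0, 0.0] 1.0
```

The center update has the same shape as an online gradient step: the current estimate moves a fraction of the way toward the offending example, and examples already inside the ball cause no update.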