Large Scale Kernel Regression via Linear Programming

  • Authors:
  • O. L. Mangasarian; David R. Musicant

  • Affiliations:
  • Computer Sciences Department, University of Wisconsin, 1210 West Dayton Street, Madison, WI 53706, USA. olvi@cs.wisc.edu
  • Department of Mathematics and Computer Science, Carleton College, One North College Street, Northfield, MN 55057, USA. dmusican@carleton.edu

  • Venue:
  • Machine Learning
  • Year:
  • 2002

Abstract

The problem of tolerant data fitting by a nonlinear surface, induced by a kernel-based support vector machine, is formulated as a linear program with fewer variables than other linear programming formulations. A generalization of the linear programming chunking algorithm for arbitrary kernels is implemented for solving problems with very large datasets, wherein chunking is performed on both data points and problem variables. The proposed approach tolerates a small error, which is adjusted parametrically, while fitting the given data. This leads to improved fitting of noisy data (over ordinary least error solutions), as demonstrated computationally. Comparative numerical results indicate an average time reduction as high as 26.0% over other formulations, with a maximal time reduction of 79.7%. Additionally, linear programs with as many as 16,000 data points and more than a billion nonzero matrix elements are solved.
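
To make the idea of tolerant (epsilon-insensitive) kernel data fitting as a linear program concrete, the following is a minimal sketch of a generic 1-norm LP support vector regression, solved with an off-the-shelf LP solver. It is not the authors' reduced-variable formulation and does not implement their chunking algorithm; the kernel choice, function names (rbf_kernel, fit_lp_svr), and parameters (C, eps, gamma) are illustrative assumptions.

```python
# Sketch: epsilon-insensitive kernel regression as a linear program (generic LP-SVR,
# not the paper's reduced-variable or chunking formulation).
import numpy as np
from scipy.optimize import linprog

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix K[i, j] = exp(-gamma * ||A_i - B_j||^2) (assumed kernel)."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def fit_lp_svr(X, y, C=1.0, eps=0.1, gamma=1.0):
    """Solve  min ||alpha||_1 + C * sum(xi)
       s.t.   |K alpha + b - y| <= eps + xi,  xi >= 0
       as an LP over the variable vector [alpha (m), s (m), xi (m), b]."""
    m = len(y)
    K = rbf_kernel(X, X, gamma)
    I, Z, e = np.eye(m), np.zeros((m, m)), np.ones((m, 1))

    # Objective: sum(s) + C * sum(xi); alpha and b are free, s bounds |alpha|.
    c = np.concatenate([np.zeros(m), np.ones(m), C * np.ones(m), [0.0]])

    # Inequality constraints A_ub x <= b_ub:
    A_ub = np.block([
        [ K, Z, -I,  e],                  #  K a + b - xi <= y + eps
        [-K, Z, -I, -e],                  # -K a - b - xi <= eps - y
        [ I, -I, Z, np.zeros((m, 1))],    #  a - s <= 0
        [-I, -I, Z, np.zeros((m, 1))],    # -a - s <= 0
    ])
    b_ub = np.concatenate([y + eps, eps - y, np.zeros(m), np.zeros(m)])

    bounds = [(None, None)] * m + [(0, None)] * m + [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    alpha, b = res.x[:m], res.x[-1]
    return alpha, b, K

# Usage: fit a noisy sine curve; residuals within the eps tube incur no penalty.
rng = np.random.default_rng(0)
X = np.linspace(0, 4, 60).reshape(-1, 1)
y = np.sin(X).ravel() + 0.05 * rng.standard_normal(60)
alpha, b, K = fit_lp_svr(X, y, C=10.0, eps=0.05, gamma=2.0)
print("max |residual|:", np.max(np.abs(K @ alpha + b - y)))
```

The tolerance eps plays the role of the parametrically adjusted small error mentioned in the abstract: residuals smaller than eps are not penalized, which is what yields the improved fitting of noisy data over ordinary least-error solutions.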