Chunking for massive nonlinear kernel classification

  • Authors:
  • O. L. Mangasarian; M. E. Thompson

  • Affiliations:
  • Computer Sciences Department, University of Wisconsin, Madison, WI, USA, and Department of Mathematics, University of California at San Diego, La Jolla, CA, USA; Computer Sciences Department, University of Wisconsin, Madison, WI, USA

  • Venue:
  • Optimization Methods & Software
  • Year:
  • 2008

Abstract

A chunking procedure [Bradley, P.S. and Mangasarian, O.L., 2000, Massive data discrimination via linear support vector machines. Optimization Methods and Software, 13, 1-10. Available online at: ftp://ftp.cs.wisc.edu/mathprog/tech-reports/98-05.ps] utilized in [Mangasarian, O.L. and Thompson, M.E., 2006, Massive data classification via unconstrained support vector machines. Journal of Optimization Theory and Applications, 131, 315-325. Available online at: ftp://ftp.cs.wisc.edu/pub/dmi/tech-reports/06-01.pdf] for linear classifiers is proposed here for nonlinear kernel classification of massive datasets. A highly accurate algorithm based on nonlinear support vector machines that utilize a linear programming formulation [Mangasarian, O.L., 2000, Generalized support vector machines. In: A. Smola, P. Bartlett, B. Scholkopf and D. Schuurmans (Eds) Advances in Large Margin Classifiers (Cambridge, MA: MIT Press), pp. 135-146. Available online at: ftp://ftp.cs.wisc.edu/math-prog/tech-reports/98-14.ps] is developed here as a completely unconstrained minimization problem [Mangasarian, O.L., 2005, Exact 1-Norm support vector machines via unconstrained convex differentiable minimization. Technical Report 05-03, Data Mining Institute, Computer Sciences Department, University of Wisconsin, Madison, Wisconsin. Available online at: ftp://ftp.cs.wisc.edu/pub/dmi/tech-reports/05-03.ps. Journal of Machine Learning Research, 7, 1517-1530, 2006.]. This approach together with chunking leads to a simple and accurate method for generating nonlinear classifiers for a 250,000-point dataset that typically exceeds machine capacity when standard linear programming methods such as CPLEX [ILOG, 2003, ILOG CPLEX 9.0 User's Manual, Incline Village, Nevada. Available online at: http://www.ilog.com/products/cplex/] are used. Because a 1-norm support vector machine underlies the proposed method, the approach together with a reduced support vector machine formulation [Lee, Y.-J. and Mangasarian, O.L., 2001, RSVM: reduced support vector machines. Proceedings of the First SIAM International Conference on Data Mining, Chicago, 5-7 April, CD-ROM. Available online at: ftp://ftp.cs.wisc.edu/pub/dmi/tech-reports/00-07.ps] minimizes the number of kernel functions utilized to generate a simplified nonlinear classifier.
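
The idea of combining chunking with a reduced kernel and a 1-norm penalty can be illustrated with a minimal sketch. It is not the authors' algorithm: the solver below is a plain subgradient descent standing in for the unconstrained minimization method the abstract cites, the chunking loop makes a single pass over the data, and all names (`gaussian_kernel`, `fit_one_norm_svm`, `chunked_fit`) and parameter values (`mu`, `nu`, `lr`, `steps`, `tol`, the toy dataset) are assumptions chosen for illustration.

```python
import math
import random


def gaussian_kernel(x, z, mu=1.0):
    """Gaussian kernel exp(-mu * ||x - z||^2)."""
    return math.exp(-mu * sum((a - b) ** 2 for a, b in zip(x, z)))


def kernel_row(x, reduced_set, mu=1.0):
    """Kernel features of x against a small reduced set (RSVM-style)."""
    return [gaussian_kernel(x, z, mu) for z in reduced_set]


def decision(row, u, gamma):
    """Classifier value K(x, reduced_set) . u - gamma."""
    return sum(r * w for r, w in zip(row, u)) - gamma


def fit_one_norm_svm(rows, labels, u, gamma, nu=10.0, steps=200, lr=0.01):
    """Stand-in solver (assumption): subgradient descent on
    nu * sum(hinge losses) + ||u||_1, warm-started from (u, gamma)."""
    u = list(u)
    for t in range(steps):
        # Subgradient of the 1-norm penalty on u.
        grad_u = [math.copysign(1.0, w) if w != 0.0 else 0.0 for w in u]
        grad_gamma = 0.0
        for row, y in zip(rows, labels):
            if y * decision(row, u, gamma) < 1.0:  # hinge term is active
                for j, r in enumerate(row):
                    grad_u[j] -= nu * y * r
                grad_gamma += nu * y
        step = lr / math.sqrt(t + 1)  # diminishing step size
        u = [w - step * g for w, g in zip(u, grad_u)]
        gamma -= step * grad_gamma
    return u, gamma


def chunked_fit(points, labels, reduced_set, chunk_size, mu=1.0, tol=1e-3):
    """Single-pass chunking: solve on one chunk at a time and carry forward
    only the points whose margin constraint is active or violated."""
    u, gamma = [0.0] * len(reduced_set), 0.0
    kept_rows, kept_labels = [], []
    for start in range(0, len(points), chunk_size):
        chunk = points[start:start + chunk_size]
        rows = kept_rows + [kernel_row(x, reduced_set, mu) for x in chunk]
        ys = kept_labels + labels[start:start + chunk_size]
        u, gamma = fit_one_norm_svm(rows, ys, u, gamma)
        active = [i for i, (r, y) in enumerate(zip(rows, ys))
                  if y * decision(r, u, gamma) <= 1.0 + tol]
        kept_rows = [rows[i] for i in active]
        kept_labels = [ys[i] for i in active]
    return u, gamma, len(kept_rows)


# Toy data (assumption): points inside a disc are +1, the rest are -1.
random.seed(0)
points = [[random.uniform(-2, 2), random.uniform(-2, 2)] for _ in range(120)]
labels = [1.0 if x[0] ** 2 + x[1] ** 2 < 1.5 else -1.0 for x in points]
reduced_set = points[:10]  # small subset of the data as kernel centers
u, gamma, n_kept = chunked_fit(points, labels, reduced_set, chunk_size=40)
```

Only one chunk plus the carried-over points is held in the working set at a time, which is what lets a dataset larger than memory be processed. The classifier uses one weight per reduced-set point, so the 1-norm penalty on `u` directly discourages the use of kernel functions.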