Incremental approximate matrix factorization for speeding up support vector machines

  • Authors:
  • Gang Wu;Edward Chang;Yen-Kuang Chen;Christopher Hughes

  • Affiliations:
  • University of California, Santa Barbara, CA;University of California, Santa Barbara, CA;Intel Research, Santa Clara, CA;Intel Research, Santa Clara, CA

  • Venue:
  • Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2006

Abstract

Traditional decomposition-based solutions to Support Vector Machines (SVMs) suffer from the well-known scalability problem. For example, given a one-million-instance training set, SVMLight takes about six days to run on a Pentium-4 server with 8 GB of memory. In this paper, we propose an incremental algorithm, which performs approximate matrix-factorization operations, to speed up SVMs. Two approximate factorization schemes, Kronecker and incomplete Cholesky, are used within the primal-dual interior-point method (IPM) to solve the quadratic optimization problem in SVMs directly. We found that a coarse approximate factorization enjoys good speedup but may suffer from poor training accuracy. Conversely, a fine-grained approximate factorization enjoys good training quality but may suffer from long training time. We therefore propose an incremental training algorithm that uses the approximate IPM solution of a coarse factorization to initialize the IPM of a fine-grained factorization. Extensive empirical studies show that our proposed incremental algorithm with approximate factorizations substantially speeds up SVM training while maintaining high training accuracy. In addition, we show that our proposed algorithm is highly parallelizable on an Intel dual-core processor.
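
The central ingredient the abstract describes is a low-rank approximate factorization of the kernel matrix. Below is a minimal, illustrative Python sketch (not the authors' implementation) of pivoted incomplete Cholesky, one of the two schemes mentioned: it factors an RBF kernel matrix as K ≈ GGᵀ, where a small rank plays the role of the coarse (fast, less accurate) factorization and a larger rank the fine-grained one; in the paper's incremental scheme, the IPM solution obtained with the coarse factor would then initialize the IPM run on the finer factor. The `rbf_kernel` helper, the rank values, and the toy data are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(X, gamma=0.5):
    # Dense RBF kernel: K[i, j] = exp(-gamma * ||x_i - x_j||^2).
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def incomplete_cholesky(K, rank, tol=1e-8):
    # Pivoted incomplete Cholesky: returns G (n x r, r <= rank)
    # such that K is approximated by G @ G.T.
    n = K.shape[0]
    G = np.zeros((n, rank))
    d = np.diag(K).copy()          # residual diagonal of K - G @ G.T
    for j in range(rank):
        i = int(np.argmax(d))      # pivot on the largest residual diagonal
        if d[i] <= tol:            # residual negligible -> stop early
            return G[:, :j]
        G[:, j] = (K[:, i] - G @ G[i, :]) / np.sqrt(d[i])
        d -= G[:, j] ** 2
    return G

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((500, 10))   # toy data, illustrative only
    K = rbf_kernel(X)
    for r in (10, 50, 200):              # coarse -> fine-grained ranks
        G = incomplete_cholesky(K, r)
        err = np.linalg.norm(K - G @ G.T) / np.linalg.norm(K)
        print(f"rank {G.shape[1]:4d}: relative factorization error {err:.3e}")
```

Because RBF kernel matrices typically have fast-decaying spectra, the approximation error should drop quickly as the rank grows, which is what makes the coarse-to-fine warm-start strategy attractive.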