Large-scale machine learning using kernel methods

  • Authors: Edward Y. Chang; Gang Wu

  • Affiliations: University of California, Santa Barbara; University of California, Santa Barbara

  • Venue: Large-scale machine learning using kernel methods

  • Year: 2006


Abstract

Kernel methods, such as Support Vector Machines (SVMs), are a core machine learning technology. They enjoy strong theoretical foundations and excellent empirical success in many pattern-recognition applications. However, when kernel methods are applied to emerging large-scale applications, such as video surveillance, multimedia information retrieval, and web mining, they suffer from ineffective and inefficient training. In this dissertation, we explore these challenges and propose strategies to address them. We first investigate the imbalanced-training challenge, which causes the training of kernel methods to be ineffective. The imbalanced-training problem occurs when the training instances of the target class are significantly outnumbered by the other training instances. In such situations, we show that the class boundary learned by an SVM can be severely skewed toward the target class. To tackle this challenge, we propose applying a conformal transformation to the kernel function in the Reproducing Kernel Hilbert Space. The training performance of kernel methods depends greatly on the chosen kernel function or matrix, which defines a pairwise similarity measure between data instances. We therefore develop an algorithm that formulates a context-dependent distance function for measuring such similarity, and we demonstrate that the learned distance function improves performance in kernel-based clustering and classification tasks. We also examine situations where the similarity measure used to build the kernel does not induce a positive semi-definite (psd) kernel matrix and hence cannot be used directly for training kernel methods; for these cases we propose an analytical framework for evaluating several representative spectrum-transformation methods. Finally, we address the efficiency of kernel methods, aiming at fast training on massive data, with a particular focus on Support Vector Machines. Traditional SVM solvers suffer from a well-known scalability problem. We propose an incremental algorithm, which performs approximate matrix-factorization operations, to speed up SVMs. Two approximate factorization schemes, Kronecker and incomplete Cholesky, are used in a primal-dual interior-point method (IPM) to solve the quadratic optimization problem in SVMs directly. Through theoretical analysis and extensive empirical studies, we show that our proposed approaches perform more effectively and efficiently than traditional methods.
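
The conformal-transformation idea mentioned in the abstract can be illustrated with a minimal sketch. The generic form k~(x, x') = D(x) D(x') k(x, x'), with a strictly positive scaling function D, keeps the kernel psd while magnifying the induced metric where D is large. The dissertation's particular choice of D (adapted to the skewed class boundary) is not reproduced here; the make_D helper below is purely hypothetical, peaking near a user-supplied set of boundary points such as support vectors from a first-pass SVM.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Standard RBF kernel: k(x, y) = exp(-gamma * ||x - y||^2)
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def conformal_kernel(X, Y, D, gamma=1.0):
    # Conformally transformed kernel: k~(x, y) = D(x) * D(y) * k(x, y).
    # Since D is positive, the result is still a valid psd kernel.
    return D(X)[:, None] * D(Y)[None, :] * rbf_kernel(X, Y, gamma)

def make_D(boundary_points, tau=1.0):
    # Hypothetical scaling function that is largest near the given
    # boundary points (e.g., support vectors of a preliminary SVM).
    def D(X):
        d2 = ((X[:, None, :] - boundary_points[None, :, :]) ** 2).sum(-1)
        return np.exp(-tau * d2).sum(axis=1) + 1e-8  # keep strictly positive
    return D

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 2))
    sv = X[:2]                                  # pretend these lie near the boundary
    K_tilde = conformal_kernel(X, X, make_D(sv))
    print(K_tilde.shape)                        # (6, 6), symmetric and psd
```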
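
For the spectrum-transformation discussion, the sketch below shows the family of methods usually considered in this setting: clipping, flipping, or shifting the eigenvalues of a symmetric but indefinite similarity matrix so that the result is psd. Whether these are exactly the variants analyzed by the dissertation's framework is an assumption; the code is a generic illustration of the technique.

```python
import numpy as np

def spectrum_transform(S, method="clip"):
    """Turn a symmetric, possibly indefinite similarity matrix S into a psd
    kernel matrix by modifying its eigenvalue spectrum."""
    S = (S + S.T) / 2.0                       # enforce exact symmetry
    w, V = np.linalg.eigh(S)                  # real spectrum of a symmetric matrix
    if method == "clip":
        w = np.maximum(w, 0.0)                # zero out negative eigenvalues
    elif method == "flip":
        w = np.abs(w)                         # reflect negative eigenvalues
    elif method == "shift":
        w = w - min(w.min(), 0.0)             # shift so the smallest eigenvalue is 0
    else:
        raise ValueError(method)
    return (V * w) @ V.T                      # reassemble K = V diag(w) V^T

if __name__ == "__main__":
    S = np.array([[ 1.0, 0.9, -0.4],
                  [ 0.9, 1.0,  0.3],
                  [-0.4, 0.3,  1.0]])
    K = spectrum_transform(S, "clip")
    print(np.linalg.eigvalsh(K).min() >= -1e-10)  # True: K is psd
```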
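
Finally, the scalability argument rests on replacing the full n-by-n kernel matrix with a low-rank approximate factorization inside the primal-dual IPM. The sketch below is a generic pivoted incomplete Cholesky that produces G with G G^T approximately equal to K; once such a factor is available, the IPM's dominant linear systems can be handled with low-rank updates (e.g., via the Sherman-Morrison-Woodbury identity) rather than dense cubic-cost factorizations. The Kronecker scheme and the IPM itself are not reproduced; this covers only the factorization step and assumes K is symmetric psd.

```python
import numpy as np

def incomplete_cholesky(K, rank, tol=1e-8):
    """Pivoted incomplete Cholesky: returns G (n x rank) with G @ G.T ~= K.
    A textbook version; the exact scheme used in the dissertation's IPM
    solver is not reproduced here."""
    n = K.shape[0]
    G = np.zeros((n, rank))
    d = np.diag(K).astype(float)              # residual diagonal
    for j in range(rank):
        i = int(np.argmax(d))                 # greedy pivot: largest residual diagonal
        if d[i] <= tol:
            return G[:, :j]                   # residual already negligible
        G[:, j] = (K[:, i] - G @ G[i, :]) / np.sqrt(d[i])
        d = d - G[:, j] ** 2                  # update residual diagonal
        d[i] = 0.0
    return G

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    K = np.exp(-0.5 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    G = incomplete_cholesky(K, rank=30)
    print(np.linalg.norm(K - G @ G.T) / np.linalg.norm(K))  # small relative error
```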