Online Kernel-Based Classification Using Adaptive Projection Algorithms

  • Authors:
  • K. Slavakis; S. Theodoridis; I. Yamada

  • Affiliations:
  • Dept. of Telecommun. Sci. & Technol., Univ. of Peloponnese, Tripoli

  • Venue:
  • IEEE Transactions on Signal Processing - Part I
  • Year:
  • 2008

Abstract

The goal of this paper is to derive a novel online algorithm for classification in reproducing kernel Hilbert spaces (RKHS) by exploiting projection-based adaptive filtering tools. The paper brings powerful convex analytic and set theoretic estimation arguments into machine learning by recasting standard kernel-based classification as the problem of finding a point that belongs to a closed halfspace (a special closed convex set) in an RKHS. In this way, classification in an online setting, where data arrive sequentially, is viewed as the problem of finding a point (classifier) in the nonempty intersection of an infinite sequence of closed halfspaces in the RKHS. Convex analysis is also used to introduce sparsification into the design by imposing an additional simple convex constraint on the norm of the classifier. An algorithmic solution to the resulting optimization problem, in which new convex constraints are added at every time instant, is given by the recently introduced adaptive projected subgradient method (APSM), which generalizes a number of well-known projection-based adaptive filtering algorithms, such as the classical normalized least mean squares (NLMS) and the affine projection algorithm (APA). Under mild conditions, the generated sequence of estimates enjoys monotone approximation, strong convergence, asymptotic optimality, and a characterization of its limit point. Furthermore, we show that the additional convex constraint on the norm of the classifier naturally leads to an online sparsification of the resulting kernel series expansion. We validate the proposed design on the adaptive equalization problem of a nonlinear channel and compare it with classical as well as recently developed stochastic gradient descent techniques.
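To make the halfspace geometry concrete, the sketch below implements one simplified projection step per sample in Python. It is not the paper's full APSM recursion, which combines several such projections per time instant with adaptive weights; the class name `ProjectionKernelClassifier` and the parameters `rho` (margin) and `delta` (norm budget) are illustrative choices, not notation from the paper.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel; its RKHS hosts the classifier."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2)))

class ProjectionKernelClassifier:
    """Minimal sketch: per sample (x, y) with y in {-1, +1}, project the
    current estimate f onto the halfspace {g : y * g(x) >= rho}, then onto
    the ball {g : ||g|| <= delta}. The classifier is kept as a kernel
    expansion f = sum_i alpha_i k(c_i, .)."""

    def __init__(self, kernel=gaussian_kernel, rho=0.1, delta=10.0):
        self.kernel, self.rho, self.delta = kernel, rho, delta
        self.centers, self.alphas = [], []

    def decision(self, x):
        # f(x) = sum_i alpha_i k(c_i, x); its sign is the predicted label.
        return sum(a * self.kernel(c, x) for c, a in zip(self.centers, self.alphas))

    def _norm(self):
        # RKHS norm: ||f||^2 = alpha^T K alpha over the current dictionary.
        if not self.centers:
            return 0.0
        K = np.array([[self.kernel(c, d) for d in self.centers] for c in self.centers])
        a = np.array(self.alphas)
        return float(np.sqrt(max(a @ K @ a, 0.0)))

    def update(self, x, y):
        # Metric projection onto the halfspace: if the margin constraint
        # y * f(x) >= rho is violated, move along y * k(x, .) just far
        # enough to satisfy it, which appends one term to the expansion.
        margin = y * self.decision(x)
        if margin < self.rho:
            self.centers.append(np.asarray(x, dtype=float))
            self.alphas.append(y * (self.rho - margin) / self.kernel(x, x))
        # Metric projection onto the norm ball (the sparsifying constraint):
        # radially rescale the coefficients whenever ||f|| exceeds delta.
        norm = self._norm()
        if norm > self.delta:
            self.alphas = [a * (self.delta / norm) for a in self.alphas]
```

Feeding labeled pairs to `update` drives the estimate toward the intersection of the accumulated halfspaces. In the paper, the norm-ball projection makes older expansion coefficients shrink, which is what permits truncating the kernel series to a finite buffer online; the sketch keeps every center for clarity.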