Improved Online Support Vector Machines Spam Filtering Using String Kernels

  • Authors:
  • Ola Amayri;Nizar Bouguila

  • Affiliations:
  • Concordia University, Montreal, Canada H3G 2W1;Concordia University, Montreal, Canada H3G 2W1

  • Venue:
  • CIARP '09 Proceedings of the 14th Iberoamerican Conference on Pattern Recognition: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications
  • Year:
  • 2009

Abstract

A major bottleneck in electronic communications is the enormous dissemination of spam emails. Developing suitable filters that can adequately capture those emails and achieve a high performance rate has become a main concern. Support vector machines (SVMs) have made a large contribution to the development of spam email filtering. In SVM-based email classification, the crucial problems are the feature mapping of input emails and the choice of kernel. In this paper, we present a thorough investigation of several distance-based kernels, propose the use of string kernels, and show their efficiency in blocking spam emails. We detail feature mapping variants in text classification (TC) that yield improved performance for standard SVMs in the filtering task. Furthermore, to cope with real-time scenarios, we propose an online active framework for spam filtering.
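As a rough illustration of what a string kernel computes, the sketch below implements a k-spectrum kernel, one common member of the string kernel family: two texts are compared by the contiguous substrings of length k that they share, with no tokenization step. The abstract does not say which string kernels the paper evaluates, so the choice of the spectrum kernel, the function names, and the default k=3 are all assumptions made for this example.

```python
from collections import Counter
import math


def spectrum_features(text, k=3):
    """Count every contiguous substring of length k in text (assumed feature map)."""
    return Counter(text[i:i + k] for i in range(len(text) - k + 1))


def spectrum_kernel(a, b, k=3):
    """Inner product of the k-gram count vectors of two strings."""
    fa, fb = spectrum_features(a, k), spectrum_features(b, k)
    # Iterate over the smaller feature map; missing k-grams count as zero.
    if len(fb) < len(fa):
        fa, fb = fb, fa
    return sum(count * fb[gram] for gram, count in fa.items())


def normalized_spectrum_kernel(a, b, k=3):
    """Cosine-normalized kernel, so message length does not dominate the score."""
    denom = math.sqrt(spectrum_kernel(a, a, k) * spectrum_kernel(b, b, k))
    return spectrum_kernel(a, b, k) / denom if denom else 0.0
```

A Gram matrix built from such a function could be passed to any SVM implementation that accepts precomputed kernels; how the paper integrates its kernels with the online active framework is not described in the abstract.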