Active Learning Image Spam Hunter

  • Authors:
  • Yan Gao;Alok Choudhary

  • Affiliations:
  • Dept. of EECS, Northwestern University, Evanston, USA;Dept. of EECS, Northwestern University, Evanston, USA

  • Venue:
  • ISVC '09 Proceedings of the 5th International Symposium on Advances in Visual Computing: Part II
  • Year:
  • 2009

Abstract

Image spam is a nuisance for email users around the world. Most previous work on image spam detection focuses on supervised learning approaches. However, it is costly to obtain enough trustworthy labels for learning, especially in an adversarial setting where spammers constantly modify their patterns to evade the classifier. To address this issue, we employ the principle of active learning, in which the learner guides the user to label as few images as possible while maximizing classification accuracy. Active learning is better suited to online image spam filtering because it dramatically reduces labeling costs with negligible overhead while maintaining high recognition performance. We present and compare two active learning algorithms, based on an SVM and a Gaussian process classifier, respectively. To the best of our knowledge, we are the first to apply active learning to the task of image spam filtering. Experimental results demonstrate that our active learning based approaches quickly achieve a high detection rate of 99% and