Catching the picospams

  • Authors:
  • Matthew Chang;Chung Keung Poon

  • Affiliations:
  • Dept. of Computer Science, City U. of Hong Kong, China;Dept. of Computer Science, City U. of Hong Kong, China

  • Venue:
  • ISMIS'05 Proceedings of the 15th international conference on Foundations of Intelligent Systems
  • Year:
  • 2005

Abstract

In this paper, we study the problem of filtering unsolicited bulk email, also known as spam. We apply a k-NN algorithm with a similarity measure called resemblance and compare it with naive Bayes and with k-NN using TF-IDF weighting. Experimental evaluation shows that our method produces the lowest-cost results under different cost models of classification. Compared with TF-IDF weighting, our method is more practical in a dynamic environment. Our method also successfully catches a notorious class of spam called picospams. We believe that it will be a useful member of a hybrid classifier.
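
As a rough illustration of the approach named in the abstract, the sketch below classifies a message by k-NN majority vote using a resemblance score. It assumes resemblance is the shingle-based set overlap (intersection size over union size of word w-shingles); the abstract does not give the definition, so this is an assumption. The function names, the shingle width `w`, the value of `k`, and the toy messages are all hypothetical, and the paper's actual preprocessing and cost models are not reproduced here.

```python
from collections import Counter


def shingles(text, w=3):
    """Return the set of contiguous w-word shingles of a text (assumed tokenization: whitespace)."""
    tokens = text.lower().split()
    if len(tokens) < w:
        # Short texts yield a single shingle made of all their tokens.
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + w]) for i in range(len(tokens) - w + 1)}


def resemblance(a, b, w=3):
    """Assumed definition: |S(a) & S(b)| / |S(a) | S(b)| over shingle sets."""
    sa, sb = shingles(a, w), shingles(b, w)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def knn_classify(message, labelled, k=3, w=3):
    """Label a message by majority vote among its k most resembling training messages."""
    ranked = sorted(labelled,
                    key=lambda pair: resemblance(message, pair[0], w),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy training set, for illustration only.
training = [
    ("buy cheap pills now online", "spam"),
    ("buy cheap pills today online", "spam"),
    ("meeting at noon in room five", "ham"),
    ("lunch at noon in room five", "ham"),
]
label = knn_classify("buy cheap pills now", training, k=3)
```

Because the score depends only on the two messages being compared, adding a new training message does not require recomputing corpus-wide statistics, which is one plausible reading of why the abstract calls the method more practical than TF-IDF weighting in a dynamic environment.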