Detecting image based spam email

  • Authors:
  • Wanli Ma;Dat Tran;Dharmendra Sharma

  • Affiliations:
  • School of Information Sciences and Engineering, University of Canberra, Australia;School of Information Sciences and Engineering, University of Canberra, Australia;School of Information Sciences and Engineering, University of Canberra, Australia

  • Venue:
  • ICHIT'06 Proceedings of the 1st international conference on Advances in hybrid information technology
  • Year:
  • 2006

Abstract

Image-based spam email can easily circumvent widely used text-based spam filters, and more and more spammers are adopting the technique. A way to determine the nature of an email from its image content is urgently needed. We propose using OCR (optical character recognition) to extract the text embedded in the images and then assessing the email from the extracted text with the same text-based engine. This approach avoids maintaining a separate image-based detection engine and benefits from the strong and reasonably mature text-based engine. Its success relies on the accuracy of the OCR; however, no matter how good an OCR engine is, misrecognition is unavoidable. We therefore also propose a Markov model that can tolerate misspellings. The proposed solution can be integrated smoothly into existing spam email filters.
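
The abstract does not describe the Markov model itself, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes the OCR step has already produced a plain string, and it uses a first-order character-level Markov chain trained on a keyword list to score tokens, so that a misrecognized token such as "v1agra" still scores closer to "viagra" than an unrelated word does. The names `CharMarkovModel` and `flag_tokens`, the smoothing constant, the alphabet size, and the threshold are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict

ALPHABET_SIZE = 128  # assumed size of the character set, used only for smoothing


class CharMarkovModel:
    """First-order character Markov chain trained on a list of keywords (illustrative)."""

    def __init__(self, keywords, smoothing=0.1):
        self.smoothing = smoothing
        self.counts = defaultdict(lambda: defaultdict(int))
        self.totals = defaultdict(int)
        for word in keywords:
            padded = "^" + word.lower() + "$"  # mark word start and end
            for a, b in zip(padded, padded[1:]):
                self.counts[a][b] += 1
                self.totals[a] += 1

    def avg_log_prob(self, token):
        """Average smoothed log-probability of the token's character transitions."""
        padded = "^" + token.lower() + "$"
        pairs = list(zip(padded, padded[1:]))
        total = 0.0
        for a, b in pairs:
            seen = self.counts.get(a, {}).get(b, 0)
            num = seen + self.smoothing
            den = self.totals.get(a, 0) + self.smoothing * ALPHABET_SIZE
            total += math.log(num / den)
        return total / len(pairs)


def flag_tokens(ocr_text, model, threshold=-3.5):
    """Return tokens whose score exceeds an arbitrary, hand-picked threshold."""
    return [t for t in ocr_text.split() if model.avg_log_prob(t) > threshold]


model = CharMarkovModel(["viagra"])
flagged = flag_tokens("cheap v1agra now", model)
```

In this sketch a single OCR error replaces only one or two transitions with low-probability ones, so the averaged score degrades gradually instead of failing outright as an exact keyword match would. In a real filter, the threshold and the keyword list would need to be tuned on labelled data.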