Detecting and filtering instant messaging spam: a global and personalized approach

  • Authors:
  • Zhijun Liu; Weili Lin; Na Li; David Lee

  • Affiliations:
  • Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio;Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio;Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio;Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio

  • Venue:
  • NPSEC'05: Proceedings of the First International Conference on Secure Network Protocols
  • Year:
  • 2005

Abstract

While instant messaging (IM) is gaining popularity, it is exposed to increasingly severe security threats. A serious problem is IM spam (spim): unsolicited commercial messages sent via instant messengers. Email spam (unsolicited bulk e-mail) has been a serious security issue for a long time, and a number of techniques have been proposed to cope with it. Spim, by contrast, has not yet received adequate attention from the research community, and traditional spam filtering techniques are not directly applicable to it because of IM's presence information and real-time nature. In this paper, we present a new architecture for detecting and filtering spim. With the unique infrastructure of IM systems, spim detection and filtering can be achieved not only at the client (receiver) side, for personalized filtering, but also at the server side and at various IM gateways, for global filtering. Our technique integrates a number of mature spam defense techniques, modified for IM applications, such as black/white lists, collaborative feedback-based filtering, content-based techniques, and challenge-response-based filtering. We also design and implement new techniques for efficient spim detection and filtering, including filtering methods based on IM sending rate, content-based spim defense techniques, fingerprint-vector-based filtering, text comparison filtering, and Bayesian filtering. We provide an analysis of their performance based on experimental results.
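
As an illustration of one idea named in the abstract, filtering based on IM sending rate, the following is a minimal Python sketch of a sliding-window rate check. It is not the authors' implementation: the class name `SendingRateFilter`, the method `allow`, and the thresholds `max_messages` and `window_seconds` are all hypothetical, chosen only to show how a server or gateway might flag a sender whose message rate is unusually high.

```python
from collections import defaultdict, deque


class SendingRateFilter:
    """Flag senders who exceed a message budget within a sliding time window.

    Hypothetical sketch: names and thresholds are assumptions, not from the paper.
    """

    def __init__(self, max_messages=5, window_seconds=10.0):
        # Both thresholds are illustrative values, not figures from the paper.
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        # Maps each sender to the timestamps of its recent messages.
        self._history = defaultdict(deque)

    def allow(self, sender, timestamp):
        """Record a message; return True if the sender is within the limit."""
        recent = self._history[sender]
        # Drop timestamps that have fallen out of the window.
        while recent and timestamp - recent[0] >= self.window_seconds:
            recent.popleft()
        recent.append(timestamp)
        return len(recent) <= self.max_messages


if __name__ == "__main__":
    rate_filter = SendingRateFilter(max_messages=3, window_seconds=10.0)
    for t in [0.0, 1.0, 2.0, 3.0, 20.0]:
        print(t, rate_filter.allow("sender@example.com", t))
```

In this sketch a blocked message still counts toward the sender's history, so a sender who keeps flooding stays blocked until its rate drops. A real deployment would presumably combine such a check with the other techniques listed in the abstract rather than rely on rate alone.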