Content-based mobile spam classification using stylistically motivated features

  • Authors:
  • Dae-Neung Sohn; Jung-Tae Lee; Kyoung-Soo Han; Hae-Chang Rim

  • Affiliations:
  • Department of Computer and Radio Communications Engineering, Korea University, 1, 5-ga, Anam-dong, Seongbuk-gu, Seoul 136-713, South Korea; Department of Computer and Radio Communications Engineering, Korea University, 1, 5-ga, Anam-dong, Seongbuk-gu, Seoul 136-713, South Korea; Division of Computer Engineering, Sungkyul University, 400-10, Anyang 8-dong, Manan-gu, Anyang-si, Gyeonggi-do 430-742, South Korea; Department of Computer and Radio Communications Engineering, Korea University, 1, 5-ga, Anam-dong, Seongbuk-gu, Seoul 136-713, South Korea

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2012

Abstract

The brevity of mobile phone messages makes it difficult to find lexical patterns that distinguish spam. This paper proposes a novel approach to classifying extremely short messages as spam that uses not only lexical features, which reflect the content of a message, but also new stylistic features, which indicate the manner in which the message is written. Experiments on two mobile phone message collections in two different languages show that the approach significantly outperforms previous content-based approaches, regardless of language.