A Spam Filtering Method Learning from Web Browsing Behavior

  • Authors:
  • Taiki Takashita; Tsuyoshi Itokawa; Teruaki Kitasuka; Masayoshi Aritsugi

  • Affiliations:
  • Department of Computer Science and Communication Engineering, Graduate School of Science and Technology, Kumamoto University, Kumamoto, Japan 860-8555 (all four authors)

  • Venue:
  • KES '08: Proceedings of the 12th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, Part II
  • Year:
  • 2008

Quantified Score

Hi-index 0.01

Abstract

This paper proposes a spam filtering method that builds on the observation that most email users also browse the Web. Because the filter learns from Web browsing behavior in the background, it reduces the burden of maintaining the spam filter. The method uses each user's Web browsing behavior to learn ham words: words are selected from browsed Web pages using TF-IDF and stored in a database called the ham words list. For each received email, the method extracts keywords from the email and from the Web pages that its URLs point to. If some of the keywords appear in the ham words list, the email is treated as ham. In our experiments, several spam emails that a Bayesian filter failed to detect were detected as spam.
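
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the tokenizer, the TF-IDF variant, how many words are kept per page, how email keywords are extracted, or how many matches are required, so the function names (`tokenize`, `top_tfidf_words`, `build_ham_words`, `is_ham`), the parameters `k` and `min_hits`, and the use of browsed pages as the background corpus for email keywords are all assumptions made for illustration. Linked pages are passed in as already-fetched text.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphabetic tokens (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def top_tfidf_words(docs, k):
    """For each document, return the k words with the highest TF-IDF score.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    Ties are broken alphabetically so the result is deterministic.
    """
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))  # document frequency: one count per document
    result = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {w: (c / len(tokens)) * math.log(n / df[w]) for w, c in tf.items()}
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result.append(ranked[:k])
    return result


def build_ham_words(browsed_pages, k=3):
    """Collect the top-k TF-IDF words of every browsed page into one ham words set."""
    ham_words = set()
    for words in top_tfidf_words(browsed_pages, k):
        ham_words.update(words)
    return ham_words


def is_ham(email_text, linked_pages, ham_words, background_docs, k=3, min_hits=1):
    """Treat the email as ham if at least min_hits of its keywords are ham words.

    Keywords come from the email body plus the text of pages its URLs point to.
    Scoring them with TF-IDF against background_docs is an assumption.
    """
    combined = " ".join([email_text, *linked_pages])
    keywords = top_tfidf_words([combined, *background_docs], k)[0]
    return sum(w in ham_words for w in keywords) >= min_hits
```

With `min_hits=1`, a single shared keyword is enough to mark an email as ham, which mirrors the "some keywords" wording of the abstract. A real deployment would likely need a stricter threshold and stop-word handling, and would use the result alongside another filter rather than on its own.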