User action based adaptive learning with weighted Bayesian classification for filtering spam mail

  • Authors:
  • Hyun-Jun Kim; Jenu Shrestha; Heung-Nam Kim; Geun-Sik Jo

  • Affiliations:
  • Corporate Technology Operations, R&D IT Infra Group, Samsung Electronics, Suwon-City, Korea; Intelligent E-Commerce Systems Lab., School of Computer Science & Engineering, Inha University, Incheon, Korea; Intelligent E-Commerce Systems Lab., School of Computer Science & Engineering, Inha University, Incheon, Korea; Intelligent E-Commerce Systems Lab., School of Computer Science & Engineering, Inha University, Incheon, Korea

  • Venue:
  • AI'06 Proceedings of the 19th Australian joint conference on Artificial Intelligence: advances in Artificial Intelligence
  • Year:
  • 2006

Abstract

E-mail is considered one of the most important communication methods, but most users suffer from spam. Much research has addressed this problem. Previous approaches have shown comparatively high performance, but they need several improvements before they can be applied in the real world. First, personalized learning is needed for better performance: spam cannot be strictly defined, because what counts as spam depends on each user. Second, there is the concept drift or interest drift problem: a user's interests, or the concept behind a given context, may change over time. Many spam filtering systems therefore use continuous learning schemes such as adaptive learning or incremental learning. However, these systems require users to supply feedback or ratings manually, and this inconvenience slows both learning and performance improvement. In this research, we developed an adaptive learning system based on an automatic weighting environment. For the automatic weighting, we categorized six user patterns (actions) in the mail system, whose weights are automatically applied in the learning phase. In the experiments, we demonstrate Bayesian classification in an adaptive learning environment and, using the proposed ideas, analyze how the results compare with adaptive learning. Finally, experiments on real-world data sets show the approach's potential for tracking concept drift and interest drift.
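
The general idea of feeding user actions into a weighted Bayesian classifier can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the six actions or give their weights, so the action names, the labels they imply, and the numeric weights in ACTION_FEEDBACK are hypothetical, as are the class and method names. The sketch uses a multinomial naive Bayes model whose counts are incremented by a per-example weight, so that each observed action updates the filter without an explicit manual rating.

```python
import math
from collections import defaultdict

# Hypothetical action -> (implied label, weight) table. The abstract mentions
# six user actions but does not name them or give weights; these are made up.
ACTION_FEEDBACK = {
    "reply": ("ham", 2.0),
    "forward": ("ham", 1.5),
    "move_to_folder": ("ham", 1.0),
    "read": ("ham", 0.5),
    "delete_unread": ("spam", 1.0),
    "mark_as_spam": ("spam", 2.0),
}


class WeightedNaiveBayes:
    """Multinomial naive Bayes whose counts grow by a per-example weight."""

    def __init__(self):
        self.class_weight = {"ham": 0.0, "spam": 0.0}
        self.token_weight = {"ham": defaultdict(float), "spam": defaultdict(float)}
        self.total_tokens = {"ham": 0.0, "spam": 0.0}
        self.vocab = set()

    def update(self, tokens, label, weight=1.0):
        # Incremental update: add the weight to the class and token counts.
        self.class_weight[label] += weight
        for t in tokens:
            self.token_weight[label][t] += weight
            self.total_tokens[label] += weight
            self.vocab.add(t)

    def observe_action(self, tokens, action):
        # Turn an observed user action into a weighted training example.
        label, weight = ACTION_FEEDBACK[action]
        self.update(tokens, label, weight)

    def spam_probability(self, tokens):
        total = sum(self.class_weight.values())
        vocab_size = len(self.vocab) or 1
        log_score = {}
        for c in ("ham", "spam"):
            # Add-one smoothing on both the prior and the token likelihoods.
            lp = math.log((self.class_weight[c] + 1.0) / (total + 2.0))
            denom = self.total_tokens[c] + vocab_size
            for t in tokens:
                lp += math.log((self.token_weight[c].get(t, 0.0) + 1.0) / denom)
            log_score[c] = lp
        # Normalize in log space to avoid underflow on long messages.
        m = max(log_score.values())
        exp = {c: math.exp(s - m) for c, s in log_score.items()}
        return exp["spam"] / (exp["ham"] + exp["spam"])
```

In this sketch, calling observe_action on each message as the user handles it keeps the model learning continuously; if a user starts replying to messages that contain tokens previously seen in spam, the spam probability for those tokens falls, which is one simple way a filter could follow interest drift.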