A Survey on Text Classification Techniques for E-mail Filtering

  • Authors:
  • Upasana Pandey;Shampa Chakravarty

  • Venue:
  • ICMLC '10 Proceedings of the 2010 Second International Conference on Machine Learning and Computing
  • Year:
  • 2010

Abstract

The continuing explosive growth of textual content on the World Wide Web has created a need for sophisticated Text Classification (TC) techniques that combine efficiency with high-quality results. E-mail filtering is one application with the potential to affect every Internet user. Although a large body of research has addressed this problem, there is a paucity of surveys that indicate trends and directions. This paper attempts to categorize the prevalent techniques for classifying e-mail as spam or legitimate, and to suggest possible techniques to fill the lacunae. Our findings suggest that context-based e-mail filtering has the most potential to improve quality, by learning various contexts, such as n-gram phrases, linguistic constructs, or user-profile-based context, in order to tailor the filtering scheme to each user.
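
To make the idea of n-gram phrases as filtering context concrete, the following is a minimal sketch and not the method of the surveyed papers. It assumes whitespace tokenization, word unigrams and bigrams as features, and a multinomial Naive Bayes classifier with add-one smoothing. The names `ngrams` and `NgramNaiveBayes` and the four training messages are hypothetical and serve only as illustration.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Return word n-grams (n = 1..n_max) of a lowercased message."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


class NgramNaiveBayes:
    """Hypothetical multinomial Naive Bayes over word n-gram counts."""

    LABELS = ("spam", "ham")

    def __init__(self, n_max=2):
        self.n_max = n_max
        self.counts = {label: Counter() for label in self.LABELS}
        self.docs = Counter()
        self.vocab = set()

    def fit(self, messages, labels):
        # Count how often each n-gram occurs in each class.
        for text, label in zip(messages, labels):
            self.docs[label] += 1
            self.counts[label].update(ngrams(text, self.n_max))
        self.vocab = set().union(*self.counts.values())
        return self

    def score(self, text, label):
        # Log prior plus add-one smoothed log likelihood of each n-gram.
        total = sum(self.counts[label].values())
        logp = math.log(self.docs[label] / sum(self.docs.values()))
        for gram in ngrams(text, self.n_max):
            logp += math.log(
                (self.counts[label][gram] + 1) / (total + len(self.vocab))
            )
        return logp

    def predict(self, text):
        return max(self.LABELS, key=lambda label: self.score(text, label))


# Toy training data, invented for illustration only.
messages = [
    "win free money now",
    "free prize claim now",
    "meeting agenda for monday",
    "please review the project report",
]
labels = ["spam", "spam", "ham", "ham"]

model = NgramNaiveBayes(n_max=2).fit(messages, labels)
print(model.predict("claim free money"))
print(model.predict("project meeting on monday"))
```

Bigrams such as "free money" carry phrase-level context that single words miss. A per-user variant could, under the same assumptions, train one such model on each user's own mail.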