Automatic foldering of email messages: a combination approach

  • Authors:
  • Tony Tam; Artur Ferreira; André Lourenço

  • Affiliations:
  • Instituto Superior de Engenharia de Lisboa, Lisboa, Portugal; Instituto Superior de Engenharia de Lisboa, Lisboa, Portugal and Instituto de Telecomunicações, Lisboa, Portugal; Instituto Superior de Engenharia de Lisboa, Lisboa, Portugal and Instituto de Telecomunicações, Lisboa, Portugal

  • Venue:
  • ECIR'12 Proceedings of the 34th European conference on Advances in Information Retrieval
  • Year:
  • 2012

Abstract

Automatic organization of email messages into folders is both an open problem and a challenge for machine learning techniques. Besides email overload, which affects many users worldwide, further difficulties arise from the semantics each user applies: the number of folders and their meaning are personal and in many cases pose difficulties for learning methods. This paper addresses automatic organization of email messages into folders using supervised learning algorithms. The textual fields of the email message (subject and body) are considered for learning, with different representations, feature selection methods, and classifiers. The participant fields are embedded into a vector-space model representation. The classification decisions from the different email fields are combined by majority voting. Experiments on a subset of the Enron Corpus and on a private email data set show a significant improvement over single classifiers on these fields as well as over previous work.
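
The combination step named in the abstract, majority voting over per-field decisions, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `majority_vote`, the field names, the folder labels, and in particular the tie-breaking rule are assumptions made for illustration, since the abstract does not say how ties are resolved.

```python
from collections import Counter


def majority_vote(predictions, fallback=None):
    """Combine per-field folder predictions by majority voting.

    predictions: dict mapping a field name (e.g. "subject") to the folder
        label predicted by the classifier trained on that field.
    fallback: optional field name whose prediction wins a tie
        (an assumption; the abstract does not specify tie handling).
    """
    if not predictions:
        raise ValueError("no predictions to combine")

    # Count how many field classifiers voted for each folder.
    counts = Counter(predictions.values())
    top = max(counts.values())
    winners = [label for label in counts if counts[label] == top]

    if len(winners) == 1:
        return winners[0]
    # Tie: prefer the fallback field's prediction if it is among the winners.
    if fallback is not None and predictions.get(fallback) in winners:
        return predictions[fallback]
    # Otherwise take the tied label that was seen first.
    return winners[0]


if __name__ == "__main__":
    # Hypothetical decisions from three field-specific classifiers.
    votes = {"subject": "projects", "body": "projects", "participants": "personal"}
    print(majority_vote(votes))
```

With three voters, as in the usage example, a tie occurs only when all three classifiers disagree, which is where the assumed fallback rule comes into play.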