Automated content labeling using context in email

  • Authors: Aravindan Raghuveer
  • Affiliations: Yahoo!, Bangalore, India
  • Venue: Proceedings of the 17th International Conference on Management of Data
  • Year: 2011

Abstract

Through a recent survey, we observe that a significant percentage of people still share photos through email. When a user composes an email with an attachment, they most likely talk about the attachment in the body of the email. In this paper, we develop a supervised machine learning framework to extract keywords relevant to the attachments from the body of the email. The extracted keywords can then be stored as an extended attribute of the file on the local filesystem. Both desktop indexing software and web-based image search portals can leverage the context-enriched keywords to enable a richer search experience. Our results on the public Enron email dataset show that the proposed extraction framework provides both high precision and high recall. As a proof of concept, we have also implemented a version of the proposed algorithms as a Mozilla Thunderbird Add-on.
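
To make the extraction and evaluation steps concrete, here is a minimal sketch of candidate-keyword feature extraction from an email body, with precision and recall computed against hand-labeled keywords. It is not the paper's method: the abstract does not list the features or the classifier, so the cue words, the stopword list, the two features, and the rule standing in for a trained classifier are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical cue words hinting that a sentence talks about an attachment.
ATTACHMENT_CUES = {"attached", "attachment", "attaching", "enclosed",
                   "photo", "photos", "pictures"}
# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"i", "have", "the", "from", "our", "see", "you", "at", "on",
             "and", "for", "are", "this", "that", "with"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def candidate_features(body: str) -> Dict[str, Dict[str, float]]:
    """Compute two illustrative features for every candidate word:
    its frequency and whether it shares a sentence with a cue word."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", body) if s.strip()]
    features: Dict[str, Dict[str, float]] = {}
    for sentence in sentences:
        tokens = tokenize(sentence)
        has_cue = any(tok in ATTACHMENT_CUES for tok in tokens)
        for tok in tokens:
            if tok in STOPWORDS or tok in ATTACHMENT_CUES or len(tok) < 3:
                continue
            feats = features.setdefault(
                tok, {"count": 0.0, "in_cue_sentence": 0.0})
            feats["count"] += 1.0
            if has_cue:
                feats["in_cue_sentence"] = 1.0
    return features


def predict_keywords(features: Dict[str, Dict[str, float]]) -> Set[str]:
    """Stand-in for a trained classifier: keep words seen near a cue."""
    return {w for w, f in features.items() if f["in_cue_sentence"] == 1.0}


def precision_recall(predicted: Set[str],
                     relevant: Set[str]) -> Tuple[float, float]:
    """Precision and recall of predicted keywords against labeled ones."""
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


body = ("Hi Sam. I have attached the photos from our Goa trip. "
        "See you at the meeting on Monday.")
predicted = predict_keywords(candidate_features(body))
precision, recall = precision_recall(predicted, {"goa", "trip", "beach"})
```

In a supervised setting, the feature dictionaries would be turned into vectors and passed to a classifier trained on labeled emails, which would replace the `predict_keywords` rule used here.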