Generating summary keywords for emails using topics

  • Authors: Mark Dredze; Hanna M. Wallach; Danny Puller; Fernando Pereira

  • Affiliations: University of Pennsylvania, Philadelphia, PA; University of Cambridge, Cambridge, UK; University of Pennsylvania, Philadelphia, PA; University of Pennsylvania, Philadelphia, PA

  • Venue: Proceedings of the 13th international conference on Intelligent user interfaces
  • Year: 2008

Abstract

Email summary keywords, used to concisely represent the gist of an email, can help users manage and prioritize large numbers of messages. We develop an unsupervised learning framework for selecting summary keywords from emails using latent representations of the underlying topics in a user's mailbox. This approach selects words that describe each message in the context of existing topics rather than simply selecting keywords based on a single message in isolation. We present and compare four methods for selecting summary keywords based on two well-known models for inferring latent topics: latent semantic analysis and latent Dirichlet allocation. The quality of the summary keywords is assessed by generating summaries for emails from twelve users in the Enron corpus. The summary keywords are then used in place of entire messages in two proxy tasks: automated foldering and recipient prediction. We also evaluate the extent to which summary keywords enhance the information already available in a typical email user interface by repeating the same tasks using email subject lines.
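The idea of choosing keywords in the context of mailbox topics, rather than from a single message in isolation, can be illustrated with a minimal sketch. It is not taken from the paper and is not claimed to be one of its four methods. It assumes a topic model such as latent Dirichlet allocation has already been fitted, and it scores each word in a message by sum_t p(word | topic t) * p(topic t | message). The function name `summary_keywords` and the toy distributions are hypothetical, made up for illustration only.

```python
from typing import Dict, List


def summary_keywords(
    message_words: List[str],
    doc_topic: List[float],
    topic_word: List[Dict[str, float]],
    k: int = 2,
) -> List[str]:
    """Return the k words of one message with the highest topic-based score.

    Score of a word = sum over topics t of p(word | t) * p(t | message).
    Both distributions are assumed to come from an already-fitted topic model.
    """
    scores = {}
    for word in set(message_words):
        scores[word] = sum(
            p_topic * topic_word[t].get(word, 0.0)
            for t, p_topic in enumerate(doc_topic)
        )
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:k]


# Toy, made-up topic-word distributions (hypothetical, not from the Enron corpus).
topic_word = [
    {"meeting": 0.5, "schedule": 0.3, "the": 0.1, "contract": 0.1},  # topic 0
    {"contract": 0.6, "gas": 0.3, "the": 0.1},                       # topic 1
]
message = ["the", "contract", "meeting", "the", "gas"]

# The same words yield different keywords depending on the message's topic mix.
mostly_topic_1 = summary_keywords(message, [0.2, 0.8], topic_word)
mostly_topic_0 = summary_keywords(message, [0.9, 0.1], topic_word)
print(mostly_topic_1)
print(mostly_topic_0)
```

In this sketch the frequent word "the" is never selected even though it occurs twice in the message, because the score depends on the topic distributions rather than on raw counts within the message. A practical system would still need a fitted model and preprocessing such as stop-word removal.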