Sentence-Level Attachment Prediction

  • Authors:
  • M-Dyaa Albakour;Udo Kruschwitz;Simon Lucas

  • Affiliations:
  • School of Computer Science and Electronic Engineering, University of Essex, Wivenhoe Park, Colchester, UK;School of Computer Science and Electronic Engineering, University of Essex, Wivenhoe Park, Colchester, UK;School of Computer Science and Electronic Engineering, University of Essex, Wivenhoe Park, Colchester, UK

  • Venue:
  • IRFC'10 Proceedings of the First International Information Retrieval Facility Conference on Advances in Multidisciplinary Retrieval
  • Year:
  • 2010

Abstract

Attachment prediction is the task of automatically identifying email messages that should contain an attachment. This is useful for tackling the problem of sending an email but forgetting to include the relevant attachment (something that happens all too often). A common Information Retrieval (IR) approach to analyzing documents such as emails is to treat the entire document as a bag of words. Here we propose a finer-grained analysis to address the problem. We aim to identify individual sentences within an email that refer to an attachment. If we detect any such sentence, we predict that the email should have an attachment. Using part of the Enron corpus for evaluation, we find that our finer-grained approach outperforms previously reported document-level attachment prediction in similar evaluation settings. A second contribution of this paper is another successful example of the ‘wisdom of the crowd’ in collecting the annotations needed to train the attachment prediction algorithm. The aggregated non-expert judgements collected on Amazon’s Mechanical Turk can be used as a substitute for much more costly expert judgements.
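
The decision rule described in the abstract (flag an email if any single sentence refers to an attachment) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the cue-phrase list, the naive sentence splitter, and all function names are hypothetical stand-ins for the trained sentence-level classifier the abstract mentions. The last function shows one simple way non-expert judgements might be aggregated, assuming a plain majority vote; the abstract does not say which aggregation scheme was used.

```python
import re
from collections import Counter

# Hypothetical cue phrases, used only for illustration. The paper trains a
# classifier on annotated sentences rather than matching a fixed list.
ATTACHMENT_CUES = ("attached", "attachment", "enclosed")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sentence_refers_to_attachment(sentence):
    """Stand-in sentence-level classifier: True if any cue phrase occurs."""
    lowered = sentence.lower()
    return any(cue in lowered for cue in ATTACHMENT_CUES)


def email_needs_attachment(body):
    """Document-level rule from the abstract: one positive sentence is enough."""
    return any(sentence_refers_to_attachment(s) for s in split_sentences(body))


def majority_label(judgements):
    """Aggregate boolean judgements by majority vote (assumed scheme).

    Ties resolve to False.
    """
    counts = Counter(judgements)
    return counts[True] > counts[False]
```

For example, `email_needs_attachment("Hi Bob. Please find the report attached. Thanks!")` flags the message because its second sentence matches a cue, while an email with no such sentence is left unflagged.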