Improving "email speech acts" analysis via n-gram selection

  • Authors:
  • Vitor R. Carvalho;William W. Cohen

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA

  • Venue:
  • ACTS '09 Proceedings of the HLT-NAACL 2006 Workshop on Analyzing Conversations in Text and Speech
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

In email conversational analysis, it is often useful to trace the the intents behind each message exchange. In this paper, we consider classification of email messages as to whether or not they contain certain intents or email-acts, such as "propose a meeting" or "commit to a task". We demonstrate that exploiting the contextual information in the messages can noticeably improve email-act classification. More specifically, we describe a combination of n-gram sequence features with careful message preprocessing that is highly effective for this task. Compared to a previous study (Cohen et al., 2004), this representation reduces the classification error rates by 26.4% on average. Finally, we introduce Ciranda: a new open source toolkit for email speech act prediction.