Extractive email thread summarization: can we do better than he said she said?

  • Authors:
  • Pablo Ariel Duboue

  • Affiliations:
  • du College, Montreal, Quebec

  • Venue:
  • INLG '12 Proceedings of the Seventh International Natural Language Generation Conference
  • Year:
  • 2012

Abstract

Human-written, good-quality extractive summaries pay great attention to the text interleaved with the extracts. In this work, we focused on the lexical choice of verbs introducing quoted text. We analyzed 4000+ high-quality summaries of a high-traffic mailing list and manually assembled 39 quotation-introducing verb classes that cover the majority of the verb occurrences. A significant amount of the data is covered by ongoing work on e-mail "speech acts." However, we found that one third of the "tail" is composed of "risky" verbs that will most likely remain beyond the state of the art for a longer time. We used this fact to highlight the trade-offs of risk taking in NLG, where interesting prose might come at the cost of unsettling some of the readers.