FastSum: fast and accurate query-based multi-document summarization

  • Authors:
  • Frank Schilder;Ravikumar Kondadadi

  • Affiliations:
  • Thomson Corp., Eagan, MN;Thomson Corp., Eagan, MN

  • Venue:
  • HLT-Short '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

We present a fast query-based multi-document summarizer called FastSum based solely on word-frequency features of clusters, documents and topics. Summary sentences are ranked by a regression SVM. The summarizer does not use any expensive NLP techniques such as parsing, tagging of names or even part of speech information. Still, the achieved accuracy is comparable to the best systems presented in recent academic competitions (i.e., Document Understanding Conference (DUC)). Because of a detailed feature analysis using Least Angle Regression (LARS), FastSum can rely on a minimal set of features leading to fast processing times: 1250 news documents in 60 seconds.