A study of two graph algorithms in topic-driven summarization

  • Authors:
  • Vivi Nastase;Stan Szpakowicz

  • Affiliations:
  • University of Ottawa, Ottawa, Canada;University of Ottawa, Ottawa, Canada and Polish Academy of Sciences, Warsaw, Poland

  • Venue:
  • TextGraphs-1 Proceedings of the First Workshop on Graph Based Methods for Natural Language Processing
  • Year:
  • 2006

Quantified Score

Hi-index 0.01

Visualization

Abstract

We study how two graph algorithms apply to topic-driven summarization in the scope of Document Understanding Conferences. The DUC 2005 and 2006 tasks were to summarize into 250 words a collection of documents on a topic consisting of a few statements or questions. Our algorithms select sentences for extraction. We measure their performance on the DUC 2005 test data, using the Summary Content Units made available after the challenge. One algorithm matches a graph representing the entire topic against each sentence in the collection. The other algorithm checks, for pairs of open-class words in the topic, whether they can be connected in the syntactic graph of each sentence. Matching performs better than connecting words, but a combination of both methods works best. They also both favour longer sentences, which makes summaries more fluent.