A study of two graph algorithms in topic-driven summarization

Authors:
Vivi Nastase; Stan Szpakowicz
Affiliations:
University of Ottawa, Ottawa, Canada; University of Ottawa, Ottawa, Canada and Polish Academy of Sciences, Warsaw, Poland
Venue:
TextGraphs-1 Proceedings of the First Workshop on Graph Based Methods for Natural Language Processing
Year:
2006

Abstract

We study how two graph algorithms apply to topic-driven summarization in the context of the Document Understanding Conferences (DUC). The DUC 2005 and 2006 tasks were to summarize, in 250 words, a collection of documents on a topic consisting of a few statements or questions. Our algorithms select sentences for extraction. We measure their performance on the DUC 2005 test data, using the Summary Content Units made available after the challenge. One algorithm matches a graph representing the entire topic against each sentence in the collection. The other algorithm checks, for pairs of open-class words in the topic, whether they can be connected in the syntactic graph of each sentence. Matching performs better than connecting words, but a combination of the two methods works best. Both methods also favour longer sentences, which makes summaries more fluent.
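
The word-connection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the syntactic graph is built or how connections are scored. The sketch assumes a sentence is given as a list of (head, dependent) word pairs, treats that graph as undirected, and counts the pairs of topic words joined by a path. The names `build_graph`, `connected`, `connected_pairs`, the optional `max_hops` limit, and the sample `EDGES` are all hypothetical and serve illustration only.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from (head, dependent) pairs."""
    graph = {}
    for head, dep in edges:
        graph.setdefault(head, set()).add(dep)
        graph.setdefault(dep, set()).add(head)
    return graph


def connected(graph, source, target, max_hops=None):
    """Breadth-first search: True if target is reachable from source.

    max_hops is a hypothetical limit on path length; None means no limit.
    """
    if source not in graph or target not in graph:
        return False
    seen, queue = {source}, deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return True
        if max_hops is not None and dist >= max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph[node] - seen:
            seen.add(neighbour)
            queue.append((neighbour, dist + 1))
    return False


def connected_pairs(topic_words, edges, max_hops=None):
    """Count pairs of distinct topic words connected in the sentence graph."""
    graph = build_graph(edges)
    words = sorted(set(topic_words))
    return sum(
        connected(graph, a, b, max_hops) for a, b in combinations(words, 2)
    )


# Hypothetical dependency edges for the sentence
# "The committee approved the budget after the debate".
EDGES = [
    ("approved", "committee"),
    ("approved", "budget"),
    ("approved", "debate"),
    ("debate", "after"),
]

if __name__ == "__main__":
    topic = ["committee", "budget", "election"]
    print(connected_pairs(topic, EDGES))
    print(connected_pairs(topic, EDGES, max_hops=1))
```

In a fully connected parse, any two words of the sentence are joined by some path, so a hop limit (or a restriction on which relations may be crossed) is one assumed way to make the check selective. A sentence ranker could then order sentences by this count, possibly combined with a graph-matching score.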