Graph-based ranking algorithms for sentence extraction, applied to text summarization
ACLdemo '04 Proceedings of the ACL 2004 on Interactive poster and demonstration sessions
Robust textual inference via graph matching
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
LexRank: graph-based lexical centrality as salience in text summarization
Journal of Artificial Intelligence Research
Evaluation of a sentence ranker for text summarization based on Roget's thesaurus
TSD'10 Proceedings of the 13th international conference on Text, speech and dialogue
Toward a gold standard for extractive text summarization
AI'10 Proceedings of the 23rd Canadian conference on Advances in Artificial Intelligence
Hi-index | 0.01 |
We study how two graph algorithms apply to topic-driven summarization in the scope of Document Understanding Conferences. The DUC 2005 and 2006 tasks were to summarize into 250 words a collection of documents on a topic consisting of a few statements or questions. Our algorithms select sentences for extraction. We measure their performance on the DUC 2005 test data, using the Summary Content Units made available after the challenge. One algorithm matches a graph representing the entire topic against each sentence in the collection. The other algorithm checks, for pairs of open-class words in the topic, whether they can be connected in the syntactic graph of each sentence. Matching performs better than connecting words, but a combination of both methods works best. They also both favour longer sentences, which makes summaries more fluent.