Comparing Topiary-style approaches to headline generation

Authors:
Ruichao Wang;Nicola Stokes;William P. Doran;Eamonn Newman;Joe Carthy;John Dunnion
Affiliations:
Intelligent Information Retrieval Group, Department of Computer Science, University College Dublin, Ireland (all authors)
Venue:
ECIR'05 Proceedings of the 27th European conference on Advances in Information Retrieval Research
Year:
2005

Citing 8

Ultra-summarization (poster abstract): a statistical approach to generating highly condensed non-extractive summaries
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval

OCELOT: a system for summarizing Web pages
SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval

The Design and Implementation of a Part of Speech Tagger for English

Lexical cohesion computed by thesaural relations as an indicator of the structure of text
Computational Linguistics

Three generative, lexicalised models for statistical parsing
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics

A new probabilistic model for title generation
COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1

Automatic evaluation of summaries using N-gram co-occurrence statistics
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1

Hedge Trimmer: a parse-and-trim approach to headline generation
HLT-NAACL-DUC '03 Proceedings of the HLT-NAACL 03 on Text summarization workshop - Volume 5

Cited 2

Multi-candidate reduction: Sentence compression as a tool for document summarization tasks
Information Processing and Management: an International Journal

Syntactic sentence compression in the biomedical domain: facilitating access to related articles
Information Retrieval

Abstract

In this paper we compare a number of Topiary-style headline generation systems. The Topiary system, developed at the University of Maryland with BBN, was the top-performing headline generation system at DUC 2004. Topiary-style headlines consist of a number of general topic labels followed by a compressed version of the lead sentence of a news story. The Topiary system uses a statistical learning approach to find topic labels for headlines, while our approach, the LexTrim system, identifies key summary words by analysing the lexical cohesive structure of a text. The performance of these systems is evaluated using the ROUGE evaluation suite on the DUC 2004 news stories collection. The results of these experiments show that a baseline system that identifies topic descriptors for headlines using term frequency counts outperforms the LexTrim and Topiary systems. A manual evaluation of the headlines also confirms this result.
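
The headline layout and the term-frequency baseline described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The stopword list, the function names, the upper-case label format, and the use of plain word truncation in place of syntactic sentence compression are all assumptions made for illustration. The abstract says only that the baseline uses term frequency counts, so raw counts are used here.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {
    "the", "a", "an", "of", "in", "on", "at", "to", "and", "or", "for",
    "is", "was", "were", "are", "by", "with", "that", "this", "it", "as",
    "from", "has", "have", "had", "be", "been", "said",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def topic_descriptors(text, k=3):
    """Return the k most frequent non-stopword terms in the text."""
    counts = Counter(t for t in tokenize(text) if t not in STOPWORDS)
    return [term for term, _ in counts.most_common(k)]


def lead_sentence(text):
    """Return the first sentence, using a naive split after ., ! or ?."""
    return re.split(r"(?<=[.!?])\s+", text.strip(), maxsplit=1)[0]


def topiary_style_headline(text, k=3, max_words=10):
    """Build 'LABELS: shortened lead sentence'.

    Keeping the first max_words words is a crude stand-in for the
    parse-based compression a real system would apply.
    """
    labels = " ".join(t.upper() for t in topic_descriptors(text, k))
    words = lead_sentence(text).rstrip(".!?").split()
    return f"{labels}: {' '.join(words[:max_words])}"


story = (
    "Flood waters rose in Dublin on Monday. The flood closed roads. "
    "Dublin officials said the flood damaged homes in Dublin."
)
headline = topiary_style_headline(story, k=2, max_words=4)
print(headline)
```

In this toy story, "flood" and "dublin" each occur three times, so they become the two topic labels that prefix the shortened lead sentence. A comparable sketch of a lexical-cohesion or statistical topic model would differ only in how `topic_descriptors` selects its terms.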