Generic technologies for single- and multi-document summarization

  • Authors:
  • Marie-Francine Moens;Roxana Angheluta;Jos Dumortier

  • Affiliations:
  • Interdisciplinary Centre for Law & IT (ICRI), Katholieke Universiteit Leuven, Tiensestraat 41, B-3000 Leuven, Belgium;Interdisciplinary Centre for Law & IT (ICRI), Katholieke Universiteit Leuven, Tiensestraat 41, B-3000 Leuven, Belgium;Interdisciplinary Centre for Law & IT (ICRI), Katholieke Universiteit Leuven, Tiensestraat 41, B-3000 Leuven, Belgium

  • Venue:
  • Information Processing and Management: an International Journal - Special issue: Cross-language information retrieval
  • Year:
  • 2005

Abstract

The technologies for single- and multi-document summarization described and evaluated in this article can be applied to heterogeneous texts and to different summarization tasks. They cover extracting important sentences from the documents, compressing those sentences to their essential or relevant content, and detecting redundant content across sentences. The technologies were tested at the Document Understanding Conference, organized by the National Institute of Standards and Technology (USA), in 2002 and 2003. The system obtained good to very good results in this competition. We also tested our summarization system on a variety of English encyclopedia texts and on Dutch magazine articles. The results show that relying on generic linguistic resources and statistical techniques offers a basis for text summarization.
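
As a rough illustration of two of the steps named in the abstract (extracting important sentences and detecting redundant content across sentences), the following minimal Python sketch may help. It is not the authors' system: the word-frequency scoring, the Jaccard overlap test, the stopword list, and the names `tokenize`, `summarize`, `max_sentences`, and `max_overlap` are all assumptions made for this example, and sentence compression is not covered.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a real linguistic resource).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "on", "is", "are", "for", "that", "it"}


def tokenize(sentence):
    """Lowercase a sentence and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def summarize(sentences, max_sentences=2, max_overlap=0.5):
    """Pick high-scoring sentences while skipping near-duplicates.

    Score: average frequency of a sentence's content words over all input
    sentences. Redundancy: Jaccard overlap of content-word sets with any
    sentence that was already selected.
    """
    tokens = [tokenize(s) for s in sentences]
    freq = Counter(w for toks in tokens for w in toks)
    scores = [
        sum(freq[w] for w in toks) / len(toks) if toks else 0.0
        for toks in tokens
    ]
    # Highest score first; ties are broken by original position.
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))

    chosen = []
    for i in ranked:
        if len(chosen) == max_sentences:
            break
        candidate = set(tokens[i])
        if not candidate:
            continue
        redundant = any(
            len(candidate & set(tokens[j])) / len(candidate | set(tokens[j])) > max_overlap
            for j in chosen
        )
        if not redundant:
            chosen.append(i)

    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    docs = [
        "The storm flooded the coastal town.",
        "The storm flooded the coastal town on Monday.",
        "Rescue teams evacuated residents.",
    ]
    for line in summarize(docs):
        print(line)
```

In this toy input the second sentence shares most of its content words with the first, so the overlap test drops it and the third sentence is selected instead. A multi-document setting would simply pass sentences from several documents in one list.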