Extracting multi-document summaries with a double clustering approach

  • Authors:
  • Sara Botelho Silveira; António Branco

  • Affiliations:
  • University of Lisbon, Portugal, Edifício C6, Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, Campo Grande, Lisboa, Portugal; University of Lisbon, Portugal, Edifício C6, Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, Campo Grande, Lisboa, Portugal

  • Venue:
  • NLDB'12 Proceedings of the 17th international conference on Applications of Natural Language Processing and Information Systems
  • Year:
  • 2012

Abstract

This paper presents a method for extractive multi-document summarization based on a two-phase clustering approach. First, sentences are clustered by similarity, and one sentence is selected from each cluster to reduce redundancy. Then, the selected sentences are clustered by the keywords of the collection, so that they are grouped by topic. The summarization process also includes a sentence simplification step, which aims not only to produce simpler and more incisive sentences, but also to make room in the summary for as much relevant content as possible.
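
The two clustering phases can be illustrated with a minimal Python sketch. The abstract does not specify the similarity measure, the rule for choosing one sentence per cluster, or how keywords are obtained, so everything concrete here is an assumption made for illustration: Jaccard overlap of word sets, a greedy single-pass clustering with a 0.5 threshold, keeping the first sentence of each cluster, and using the most frequent non-stopword terms as keywords. All function names are hypothetical, and the sentence simplification step is not sketched.

```python
import re
from collections import Counter, defaultdict


def tokens(sentence):
    """Lowercased set of alphabetic words in a sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def jaccard(a, b):
    """Jaccard overlap of two word sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_similarity(sentences, threshold=0.5):
    """Phase 1: a sentence joins the first cluster whose seed is similar enough."""
    clusters = []
    for s in sentences:
        for c in clusters:
            if jaccard(tokens(s), tokens(c[0])) >= threshold:
                c.append(s)
                break
        else:
            clusters.append([s])  # no similar cluster found: start a new one
    return clusters


def pick_representatives(clusters):
    """Keep one sentence per cluster (assumption: the cluster's first sentence)."""
    return [c[0] for c in clusters]


def top_keywords(sentences, stopwords, k=2):
    """Assumed keyword extraction: the k most frequent non-stopword terms."""
    counts = Counter(w for s in sentences for w in tokens(s) if w not in stopwords)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))  # ties: alphabetical
    return [w for w, _ in ranked[:k]]


def cluster_by_keyword(sentences, keywords):
    """Phase 2: group sentences under the first keyword they contain."""
    topics = defaultdict(list)
    for s in sentences:
        words = tokens(s)
        label = next((kw for kw in keywords if kw in words), None)
        topics[label].append(s)  # None collects sentences with no keyword
    return dict(topics)


sentences = [
    "The storm flooded the city center.",
    "The storm flooded the city center on Monday.",
    "Rescue teams evacuated residents from the flooded streets.",
    "The government promised funds for repairs.",
    "Government funds will arrive next week.",
]
stopwords = {"the", "from", "for", "on", "will", "next"}

representatives = pick_representatives(cluster_by_similarity(sentences))
keywords = top_keywords(representatives, stopwords)
topics = cluster_by_keyword(representatives, keywords)
```

In this toy input the second sentence is a near-duplicate of the first, so phase 1 drops it. Phase 2 then groups the remaining sentences under the two most frequent terms. A real system would likely use a stronger similarity measure and a more principled keyword extractor than raw term frequency.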