Exploring clustering for multi-document Arabic summarisation

  • Authors: Mahmoud El-Haj; Udo Kruschwitz; Chris Fox

  • Affiliations: Computer Science and Electronic Engineering, University of Essex, United Kingdom (all three authors)

  • Venue: AIRS'11 Proceedings of the 7th Asia conference on Information Retrieval Technology
  • Year: 2011

Abstract

In this paper we explore clustering for multi-document Arabic summarisation. For our evaluation we use an Arabic version of the DUC-2002 dataset that we previously generated using Google Translate. We explore how sentence-level clustering can be applied both to multi-document summarisation and to redundancy elimination within that process. We vary the parameter settings, including the cluster size and the selection model applied in the extractive summarisation process. The automatically generated summaries are evaluated using the ROUGE metric, as well as precision and recall. We compare our results with those of the top five systems in the DUC-2002 multi-document summarisation task.
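
The general idea of sentence-level clustering for extractive summarisation can be illustrated with a minimal sketch. The abstract does not state which clustering algorithm, sentence features or selection models the authors use, so everything below is an assumption made for illustration: k-means-style clustering over bag-of-words vectors with cosine similarity, seeding from the first k sentences, and a selection rule that keeps the sentence closest to each cluster centroid. The function names are hypothetical, and whitespace tokenisation is a simplification; real Arabic text would need proper tokenisation and normalisation.

```python
from collections import Counter
import math


def vectorise(sentence):
    """Bag-of-words term-frequency vector (whitespace tokenisation is an assumption)."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Mean of a non-empty list of term vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({t: c / len(vectors) for t, c in total.items()})


def cluster_sentences(sentences, k, iterations=10):
    """Assign each sentence to one of k clusters (k-means-style, assumed here)."""
    vectors = [vectorise(s) for s in sentences]
    k = min(k, len(vectors))
    centroids = vectors[:k]  # deterministic seeding: the first k sentences
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every sentence to its most similar centroid.
        assignment = [
            max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors
        ]
        # Recompute each centroid from its current members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = centroid(members)
    return assignment, centroids, vectors


def summarise(sentences, k):
    """Keep one sentence per cluster: the one closest to the cluster centroid.

    Near-duplicate sentences tend to fall into the same cluster, so keeping
    a single representative per cluster also removes redundancy.
    """
    assignment, centroids, vectors = cluster_sentences(sentences, k)
    picked = []
    for c in range(len(centroids)):
        members = [i for i, a in enumerate(assignment) if a == c]
        if members:
            picked.append(max(members, key=lambda i: cosine(vectors[i], centroids[c])))
    return [sentences[i] for i in sorted(picked)]  # restore original order


if __name__ == "__main__":
    docs = [
        "the flood hit the town",
        "the flood hit the city",
        "markets rose sharply today",
        "markets rose again today",
    ]
    print(summarise(docs, 2))
```

In this sketch the number of clusters `k` also sets the summary length, which is one simple way to connect cluster size to the output; other selection models (for example, taking sentences only from the largest cluster) would be swapped in at `summarise`.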