Multi-document summarization based on BE-Vector clustering

  • Authors:
  • Dexi Liu;Yanxiang He;Donghong Ji;Hua Yang

  • Affiliations:
  • School of Computer, Wuhan University, Wuhan, P.R. China;School of Computer, Wuhan University, Wuhan, P.R. China;Center for Study of Language and Information, Wuhan University, Wuhan, P.R. China;School of Computer, Wuhan University, Wuhan, P.R. China

  • Venue:
  • CICLing'06: Proceedings of the 7th International Conference on Computational Linguistics and Intelligent Text Processing
  • Year:
  • 2006

Abstract

In this paper, we propose a novel multi-document summarization strategy based on Basic Element (BE) vector clustering. In this strategy, sentences are represented by BE vectors, rather than word or term vectors, before clustering. A BE is a head-modifier-relation triple that represents sentence content, and it serves as a more precise semantic unit than a single word. The BE vectors are clustered with the k-means method, and a novel cluster analysis method is employed to automatically detect the number of clusters, K. The experimental results indicate that the proposed strategy outperforms the traditional summarization strategy based on word-vector clustering: on DUC04 task 2, summaries generated by the proposed strategy achieve a ROUGE-1 score of 0.37291, compared with 0.36936 for the traditional strategy.
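The core idea of clustering sentences by their BE vectors can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions. The BE triples are written by hand and are purely hypothetical, whereas real BE extraction requires a syntactic parser. Vectors hold raw BE counts and are compared by squared Euclidean distance. K is fixed by the caller, because the abstract does not describe the automatic detection of K. The function names `be_vectors` and `kmeans` are illustrative only.

```python
from collections import Counter


def be_vectors(sentences_bes):
    """Map each sentence (a list of BE triples) to a count vector over the BE vocabulary."""
    vocab = sorted({be for bes in sentences_bes for be in bes})
    index = {be: i for i, be in enumerate(vocab)}
    vectors = []
    for bes in sentences_bes:
        vec = [0.0] * len(vocab)
        for be, count in Counter(bes).items():
            vec[index[be]] = float(count)
        vectors.append(vec)
    return vocab, vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iter=100):
    """Plain k-means; centroids start at the first k vectors, so runs are deterministic."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [-1] * len(vectors)
    for _ in range(max_iter):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments are stable, so stop
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical (head, modifier, relation) triples for four sentences on two topics.
sentences_bes = [
    [("summit", "economic", "mod"), ("leaders", "met", "subj")],
    [("earthquake", "strong", "mod"), ("city", "hit", "obj")],
    [("summit", "economic", "mod"), ("leaders", "signed", "subj")],
    [("earthquake", "strong", "mod"), ("rescuers", "searched", "subj")],
]

vocab, vectors = be_vectors(sentences_bes)
labels = kmeans(vectors, k=2)
print(labels)
```

In this toy input, the two "summit" sentences share one BE and the two "earthquake" sentences share another, so each pair lands in its own cluster. A summarizer built on this idea would then pick representative sentences from each cluster, a step the abstract does not detail and the sketch leaves out.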