Combining summaries using unsupervised rank aggregation

  • Authors:
  • Girish Keshav Palshikar;Shailesh Deshpande;G. Athiappan

  • Affiliations:
  • Tata Research Development and Design Centre (TRDDC), Tata Consultancy Services Limited, Pune, India;Tata Research Development and Design Centre (TRDDC), Tata Consultancy Services Limited, Pune, India;Tata Research Development and Design Centre (TRDDC), Tata Consultancy Services Limited, Pune, India

  • Venue:
  • CICLing'12 Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part II
  • Year:
  • 2012

Abstract

We model the problem of combining multiple summaries of a given document into a single summary in terms of the well-known rank aggregation problem. Treating sentences in the document as candidates and summarization algorithms as voters, we determine the winners in an election where each voter selects and ranks k candidates in order of its preference. Many rank aggregation algorithms are supervised: they discover an optimal rank aggregation function from a training dataset in which each "record" consists of a set of candidate rankings and a model ranking. However, significant disagreements between model summaries created by human experts, as well as the high cost of creating them, make it interesting to explore the use of unsupervised rank aggregation techniques. We use the well-known Condorcet methodology, including a new variation to improve its suitability. As voters, we include summarization algorithms from the literature and two new ones proposed here: the first is based on keywords and the second is a variant of the lexical-chain-based algorithm in [1]. We experimentally demonstrate that the combined summary is often very similar (when compared using different measures) to the model summary produced manually by human experts.
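
The election framing above can be illustrated with a minimal sketch of Condorcet-style aggregation. This is not the authors' variation, which the abstract does not specify. It is a plain pairwise-majority count (Copeland scoring) under stated assumptions: the function name `condorcet_top_k` is hypothetical, each voter's list holds sentence indices with the best first, a sentence a voter did not select is treated as ranked below all of that voter's selected sentences, and ties in the final order are broken by sentence index.

```python
from itertools import combinations


def condorcet_top_k(rankings, k):
    """Aggregate top-k rankings by pairwise majority (Copeland scoring).

    rankings: list of lists of sentence ids, best first, one list per voter
              (here a voter is a summarization algorithm).
    Returns the k sentence ids that win the most pairwise contests.
    Assumption: a sentence missing from a voter's list ranks below every
    sentence that voter did select; two missing sentences tie for that voter.
    """
    candidates = sorted({s for r in rankings for s in r})
    # Precompute each voter's position map once.
    positions = [{s: i for i, s in enumerate(r)} for r in rankings]
    wins = {c: 0 for c in candidates}

    for a, b in combinations(candidates, 2):
        a_votes = b_votes = 0
        for r, pos in zip(rankings, positions):
            pa = pos.get(a, len(r))  # unranked -> worst position
            pb = pos.get(b, len(r))
            if pa < pb:
                a_votes += 1
            elif pb < pa:
                b_votes += 1
        # The candidate preferred by more voters wins this contest.
        if a_votes > b_votes:
            wins[a] += 1
        elif b_votes > a_votes:
            wins[b] += 1

    # Most pairwise wins first; ties broken by sentence id (an assumption).
    return sorted(candidates, key=lambda c: (-wins[c], c))[:k]


# Three hypothetical summarizers, each picking and ranking 3 sentences.
voters = [[2, 5, 7], [5, 2, 9], [5, 7, 2]]
combined = condorcet_top_k(voters, 3)
```

In this toy election sentence 5 beats every other sentence head to head, so it is the Condorcet winner and leads the combined summary. Pairwise counting needs no model summary, which is what makes the approach unsupervised.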