Obtaining single document summaries using latent dirichlet allocation

  • Authors:
  • Karthik Nagesh;M. Narasimha Murty

  • Affiliations:
  • Dept of Computer Science and Automation, Indian Institute of Science, Bangalore, India;Dept of Computer Science and Automation, Indian Institute of Science, Bangalore, India

  • Venue:
  • ICONIP'12 Proceedings of the 19th international conference on Neural Information Processing - Volume Part IV
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we present a novel approach that makes use of topic models based on Latent Dirichlet allocation(LDA) for generating single document summaries. Our approach is distinguished from other LDA based approaches in that we identify the summary topics which best describe a given document and only extract sentences from those paragraphs within the document which are highly correlated given the summary topics. This ensures that our summaries always highlight the crux of the document without paying any attention to the grammar and the structure of the documents. Finally, we evaluate our summaries on the DUC 2002 Single document summarization data corpus using ROUGE measures. Our summaries had higher ROUGE values and better semantic similarity with the documents than the DUC summaries.