Towards multi-document summarization of scientific articles: making interesting comparisons with SciSumm

  • Authors:
  • Nitin Agarwal;Kiran Gvr;Ravi Shankar Reddy;Carolyn Penstein Rosé

  • Affiliations:
  • Carnegie Mellon University;Language Technologies Resource Center, IIIT-Hyderabad, India;Language Technologies Resource Center, IIIT-Hyderabad, India;Carnegie Mellon University

  • Venue:
  • WASDGML '11 Proceedings of the Workshop on Automatic Summarization for Different Genres, Media, and Languages
  • Year:
  • 2011

Abstract

We present a novel unsupervised approach to multi-document summarization of scientific articles, in which the document collection is a list of papers cited together within the same source article, otherwise known as a co-citation. At the heart of the approach are a topic-based clustering of fragments extracted from each co-cited article and a relevance ranking that uses a query generated from the context surrounding the co-cited list of papers. This analysis enables the generation of an overview of the common themes in the co-cited papers that relate to the context in which the co-citation was found. We present a system called SciSumm that embodies this approach and apply it to the 2008 ACL Anthology. We evaluate the system on relevant content selection using gold-standard summaries prepared according to principle-based guidelines. This evaluation demonstrates that our system performs better in content selection than an existing summarization system (MEAD). We present a detailed summary of our findings and discuss possible directions for future research.
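
The two steps named in the abstract, clustering fragments from co-cited papers and ranking them against a query built from the citation context, can be illustrated with a small sketch. The abstract does not specify SciSumm's clustering algorithm, term weighting, or ranking function, so everything below is an assumption chosen for brevity: raw word counts, cosine similarity, a greedy single-pass clusterer, and the hypothetical names `bag_of_words`, `cluster_fragments`, `rank_clusters`, and `threshold`. The sample fragments are invented for illustration and are not taken from the ACL Anthology.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a stand-in for a real term-weighting scheme."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_fragments(fragments, threshold=0.3):
    """Greedy single-pass clustering (an assumed technique, not SciSumm's):
    each fragment joins its most similar cluster, or starts a new one
    when no cluster reaches the similarity threshold."""
    clusters = []  # each cluster: {"centroid": Counter, "members": [str]}
    for frag in fragments:
        vec = bag_of_words(frag)
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [frag]})
        else:
            best["centroid"].update(vec)  # centroid = summed member counts
            best["members"].append(frag)
    return clusters


def rank_clusters(clusters, citation_context):
    """Order clusters by similarity to a query built from the citation context."""
    query = bag_of_words(citation_context)
    return sorted(clusters, key=lambda c: cosine(query, c["centroid"]), reverse=True)


# Invented fragments standing in for text extracted from co-cited papers.
fragments = [
    "statistical machine translation with phrase alignment",
    "phrase alignment models for machine translation",
    "dependency parsing with graph algorithms",
]
# Invented text standing in for the sentence surrounding the co-citation.
context = "recent work on machine translation alignment"

clusters = cluster_fragments(fragments)
ranked = rank_clusters(clusters, context)
for cluster in ranked:
    print(cluster["members"])
```

In this toy run the two translation fragments share a cluster, which ranks ahead of the parsing fragment because only it overlaps with the citation context. A realistic system would likely replace raw counts with a weighting such as TF-IDF and tune or remove the fixed threshold.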