SciSumm: a multi-document summarization system for scientific articles

  • Authors:
  • Nitin Agarwal;Kiran Gvr;Ravi Shankar Reddy;Carolyn Penstein Rosé

  • Affiliations:
  • Carnegie Mellon University;Language Technologies Resource Center, IIIT-Hyderabad, India;Language Technologies Resource Center, IIIT-Hyderabad, India;Carnegie Mellon University

  • Venue:
  • HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: Systems Demonstrations
  • Year:
  • 2011

Abstract

In this demo, we present SciSumm, an interactive multi-document summarization system for scientific articles. The document collection to be summarized is a list of papers cited together within the same source article, otherwise known as a co-citation. At the heart of the approach is a topic-based clustering of fragments extracted from each article, where the fragments are selected using queries generated from the context surrounding the co-cited list of papers. This analysis enables the generation of an overview of common themes from the co-cited papers that relate to the context in which the co-citation was found. SciSumm is currently built over the 2008 ACL Anthology; however, the generalizable nature of the summarization techniques and the extensible architecture make it possible to use the system with other corpora where a citation network is available. Evaluation results on the same corpus demonstrate that our system performs better than MEAD, an existing, widely used multi-document summarization system.
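
The pipeline described in the abstract (build a query from the citation context, pull matching fragments from the co-cited papers, then group them by topic) can be illustrated with a minimal sketch. The abstract does not specify how queries are generated or which clustering algorithm is used, so everything below is an assumption made for illustration: the stopword list, the bag-of-words cosine similarity, the greedy single-pass clustering, the threshold values, and the function names `select_fragments` and `cluster_fragments` are hypothetical and are not taken from SciSumm.

```python
import math
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list, not the one used by SciSumm.
STOPWORDS = {"the", "a", "an", "of", "in", "for", "and", "to", "is",
             "on", "with", "we", "are", "by", "from"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def select_fragments(context, fragments, min_score=0.1):
    """Keep fragments whose similarity to the citation context is high enough.

    Assumption: the 'query' is simply the bag of words of the context.
    """
    query = Counter(tokenize(context))
    return [f for f in fragments
            if cosine(query, Counter(tokenize(f))) >= min_score]


def cluster_fragments(fragments, threshold=0.3):
    """Greedy single-pass clustering: join the most similar cluster, else start one.

    Assumption: a stand-in for the topic-based clustering named in the abstract.
    """
    clusters = []  # each entry: {"centroid": Counter, "members": [str, ...]}
    for frag in fragments:
        vec = Counter(tokenize(frag))
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [frag]})
        else:
            best["centroid"].update(vec)  # accumulate term counts
            best["members"].append(frag)
    return [c["members"] for c in clusters]
```

In this sketch, `select_fragments` plays the role of query-driven fragment extraction and `cluster_fragments` groups the surviving fragments into themes; a summary could then be assembled by picking one representative fragment per cluster, which is again an illustrative choice rather than the system's documented behavior.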