Combining syntax and semantics for automatic extractive single-document summarization

  • Authors:
  • Araly Barrera; Rakesh Verma

  • Affiliations:
  • Computer Science Department, University of Houston, Houston, TX; Computer Science Department, University of Houston, Houston, TX

  • Venue:
  • CICLing'12 Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part II
  • Year:
  • 2012

Abstract

The goal of automated summarization is to tackle the "information overload" problem by extracting and perhaps compressing the most important content of a document. Because single-document summarization has difficulty beating a standard baseline, especially for news articles, most current efforts focus on multi-document summarization. The goal of this study is to reconsider the importance of single-document summarization by introducing a new approach and its implementation. This approach combines syntactic, semantic, and statistical methodologies, and reflects psychological findings that pinpoint specific selection patterns humans follow as they construct summaries. Our system achieves successful summary evaluation results and outperforms the baseline on two separate data sets: the Document Understanding Conference (DUC) 2002 data set and a set of scientific magazine articles. These results have implications not only for extractive and abstractive single-document summarization, but could also be leveraged in multi-document summarization.