A classification algorithm for predicting the structure of summaries

  • Authors: Horacio Saggion

  • Affiliations: University of Sheffield, Sheffield, United Kingdom

  • Venue: UCNLG+Sum '09 Proceedings of the 2009 Workshop on Language Generation and Summarisation
  • Year: 2009

Abstract

We investigate the problem of generating the structure of short, domain-independent abstracts. We apply a supervised machine learning approach trained on a set of abstracts collected from abstracting services and automatically annotated with a text analysis tool. Drawing on past research in content selection, information ordering, and rhetorical analysis, we design a set of features for training an algorithm that then predicts the discourse structure of unseen abstracts. The proposed approach, which combines local and contextual features, predicts the local structure of the abstracts correctly in just over 60% of cases.
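
The idea of combining a local feature with a contextual one to predict the next element of an abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the rhetorical labels (PROBLEM, METHOD, and so on), the toy training sequences, the two features (sentence position and previous label), and the count-based classifier with backoff. The abstract does not specify the actual label set, features, or learning algorithm, so this is not the paper's method.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: each abstract is a sequence of rhetorical labels.
# These labels and sequences are invented for illustration only.
TRAIN = [
    ["PROBLEM", "METHOD", "RESULT"],
    ["PROBLEM", "METHOD", "METHOD", "RESULT"],
    ["TOPIC", "METHOD", "RESULT"],
    ["PROBLEM", "METHOD", "RESULT", "CONCLUSION"],
]


def features(history):
    """Return (local, contextual) features for the next slot.

    Local feature: position of the slot, capped at 3.
    Contextual feature: label of the preceding sentence.
    """
    position = min(len(history), 3)
    previous = history[-1] if history else "<START>"
    return position, previous


def train(abstracts):
    """Count which label follows each feature combination."""
    full = defaultdict(Counter)     # (position, previous) -> label counts
    backoff = defaultdict(Counter)  # previous -> label counts
    for labels in abstracts:
        for i, label in enumerate(labels):
            position, previous = features(labels[:i])
            full[(position, previous)][label] += 1
            backoff[previous][label] += 1
    return full, backoff


def predict(model, history):
    """Predict the next label, backing off to the previous label alone."""
    full, backoff = model
    position, previous = features(history)
    for table, key in ((full, (position, previous)), (backoff, previous)):
        counts = table.get(key)
        if counts:
            return counts.most_common(1)[0][0]
    return None  # context never seen in training


model = train(TRAIN)
print(predict(model, []))
print(predict(model, ["PROBLEM"]))
print(predict(model, ["PROBLEM", "METHOD"]))
```

Calling `predict` repeatedly and appending each prediction to `history` would generate a full structure one element at a time. A real system would replace the count tables with a trained classifier over a richer feature set.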