Using syntactic information to extract relevant terms for multi-document summarization

  • Authors:
  • Enrique Amigó;Julio Gonzalo;Víctor Peinado;Anselmo Peñas;Felisa Verdejo

  • Affiliations:
  • Universidad Nacional de Educación a Distancia, Madrid - Spain;Universidad Nacional de Educación a Distancia, Madrid - Spain;Universidad Nacional de Educación a Distancia, Madrid - Spain;Universidad Nacional de Educación a Distancia, Madrid - Spain;Universidad Nacional de Educación a Distancia, Madrid - Spain

  • Venue:
  • COLING '04 Proceedings of the 20th international conference on Computational Linguistics
  • Year:
  • 2004

Abstract

The identification of the key concepts in a set of documents is a useful source of information for several information access applications. We are interested in its application to multi-document summarization, both for the automatic generation of summaries and for interactive summarization systems. In this paper, we study whether the syntactic position of terms in the texts can be used to predict which terms are good candidates as key concepts. Our experiments show that a) distance to the verb is highly correlated with the probability of a term being part of a key concept; b) subject modifiers are the best syntactic locations to find relevant terms; and c) in the task of automatically finding key terms, the combination of statistical term weights with shallow syntactic information gives better results than statistical measures alone.
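
Finding (c) can be illustrated with a minimal sketch. The abstract does not say how the two signals are combined, so everything below is an assumption made for illustration: the tf-idf weighting, the multiplicative combination, the position labels, and the boost values in `POSITION_BOOST` are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict
from math import log

# Hypothetical boosts per syntactic position (illustrative values only,
# not reported in the paper). Subject modifiers get the largest boost,
# loosely mirroring finding (b) in the abstract.
POSITION_BOOST = {
    "subject_modifier": 1.5,
    "subject_head": 1.2,
    "object": 1.0,
    "other": 0.8,
}


def rank_terms(occurrences, doc_freq, n_docs):
    """Rank terms by tf-idf multiplied by their average position boost.

    occurrences: list of (term, syntactic_position) pairs
    doc_freq:    dict mapping term -> number of documents containing it
    n_docs:      total number of documents in the collection
    """
    counts = Counter(term for term, _ in occurrences)

    # Sum the boost of every occurrence so it can be averaged per term.
    boost_sum = defaultdict(float)
    for term, position in occurrences:
        boost_sum[term] += POSITION_BOOST.get(position, POSITION_BOOST["other"])

    scores = {}
    for term, tf in counts.items():
        statistical = tf * log(n_docs / doc_freq[term])  # plain tf-idf
        syntactic = boost_sum[term] / tf                 # mean boost
        scores[term] = statistical * syntactic

    return sorted(scores, key=scores.get, reverse=True)
```

With equal statistical weights, a term seen mostly as a subject modifier ranks above one seen mostly as an object, which is the kind of tie-breaking the syntactic signal is meant to provide in this sketch.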