An empirical study of information synthesis tasks

  • Authors:
  • Enrique Amigó; Julio Gonzalo; Víctor Peinado; Anselmo Peñas; Felisa Verdejo

  • Affiliations:
  • Universidad Nacional de Educación a Distancia, Madrid, Spain (all authors)

  • Venue:
  • ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
  • Year:
  • 2004

Abstract

This paper describes an empirical study of the "Information Synthesis" task, defined as the process of (given a complex information need) extracting, organizing and inter-relating the pieces of information contained in a set of relevant documents, in order to obtain a comprehensive, non-redundant report that satisfies the information need. Two main results are presented: a) the creation of an Information Synthesis testbed with 72 reports manually generated by nine subjects for eight complex topics with 100 relevant documents each; and b) an empirical comparison of similarity metrics between reports, under the hypothesis that the best metric is the one that best distinguishes between manually and automatically generated reports. A metric based on key-concept overlap gives better results than metrics based on n-gram overlap (such as ROUGE) or sentence overlap.
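
To make the contrast between the metric families concrete, the following is a minimal sketch of two report-similarity scores: a ROUGE-style n-gram recall and a set-based key-concept overlap. It is not the paper's implementation. The function names, the whitespace tokenization, and the assumption that key concepts arrive as ready-made lists of strings are all illustrative choices made here; the abstract does not specify how concepts are identified or how the scores are computed.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_recall(candidate, reference, n=1):
    """ROUGE-N-style recall: fraction of reference n-grams found in the candidate.

    Tokenization is naive (lowercase, split on whitespace); this is an
    assumption for illustration only.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clip each match by the candidate count so repeats are not over-credited.
    matched = sum(min(count, cand[gram]) for gram, count in ref.items())
    return matched / total


def concept_overlap(candidate_concepts, reference_concepts):
    """Fraction of reference key concepts that also appear in the candidate.

    Concepts are assumed to be pre-extracted strings (hypothetical input).
    """
    ref = set(reference_concepts)
    if not ref:
        return 0.0
    return len(ref & set(candidate_concepts)) / len(ref)
```

Under the abstract's hypothesis, a metric would be preferred if, over the testbed, it scored manual reports as more similar to one another than to automatically generated reports.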