Automatic text summarization based on the Global Document Annotation

  • Authors:
  • Katashi Nagao;Kôiti Hasida

  • Affiliations:
  • Sony Computer Science Laboratory Inc., Tokyo, Japan;Electrotechnical Laboratory, Ibaraki, Japan

  • Venue:
  • COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2
  • Year:
  • 1998

Quantified Score

Hi-index 0.00

Visualization

Abstract

The GDA (Global Document Annotation) project proposes a tag set which allows machines to automatically infer the underlying semantic/pragmatic structure of documents. Its objectives are to promote development and spread of NLP/AI applications to render GDA-tagged documents versatile and intelligent contents, which should motivate WWW (World Wide Web) users to tag their documents as part of content authoring. This paper discusses automatic text summarization based on GDA. Its main features are a domain/style-free algorithm and personalization on summarization which reflects readers' interests and preferences. In order to calculate the importance score of a text element, the algorithm uses spreading activation on an intradocument network which connects text elements via thematic, rhetorical, and coreferential relations. The proposed method is flexible enough to dynamically generate summaries of various sizes. A summary browser supporting personalization is reported as well.