A text-extraction based summarizer

  • Authors:
  • Tomek Strzalkowski;Gees C. Stein;G. Bowden Wise

  • Affiliations:
  • GE Corporate Research & Development, Niskayuna, NY;GE Corporate Research & Development, Niskayuna, NY;GE Corporate Research & Development, Niskayuna, NY

  • Venue:
  • TIPSTER '98: Proceedings of a workshop held at Baltimore, Maryland, October 13-15, 1998
  • Year:
  • 1998

Abstract

We present an automated method for generating human-readable summaries from a variety of text documents, including newspaper articles, business reports, government documents, and even broadcast news transcripts. Our approach exploits the empirical observation that much written text displays certain regularities of organization and style, which we call the Discourse Macro Structure (DMS). A summary is therefore created to reflect the components of a given DMS. To produce a coherent and readable summary, we select continuous, well-formed passages from the source document and assemble them into a mini-document within a DMS template. In this paper we describe an automated summarizer that can generate both short indicative abstracts, useful for quickly scanning a list of documents, and longer informative digests that can serve as surrogates for the full text. The summarizer can assist users of an information retrieval system in assessing the quality of search results, preparing reports and memos for their customers, and even building more effective search queries.
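
The passage-selection idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how passages are scored or which components a DMS contains, so the two-slot template (`background`, `main`), the word-frequency score, and the word budget `max_words` below are all hypothetical stand-ins chosen only to show whole passages being selected and arranged inside a template.

```python
import re
from collections import Counter

# Hypothetical two-slot template; the real DMS components are not given in the abstract.
TEMPLATE = ("background", "main")


def split_passages(text):
    """Treat blank-line-separated paragraphs as the contiguous passages."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def content_words(passage):
    """Lowercased alphabetic words longer than 3 letters (a crude stop-word filter)."""
    return [w for w in re.findall(r"[a-z]+", passage.lower()) if len(w) > 3]


def score(passage, counts):
    """Assumed score: mean document-wide frequency of the passage's content words."""
    words = content_words(passage)
    if not words:
        return 0.0
    return sum(counts[w] for w in words) / len(words)


def summarize(text, max_words):
    """Fill the template with whole passages, staying within a word budget."""
    passages = split_passages(text)
    if not passages:
        return ""
    counts = Counter(w for p in passages for w in content_words(p))
    ranked = sorted(passages, key=lambda p: score(p, counts), reverse=True)

    # The top-ranked passage always fills the "main" slot.
    slots = {"main": ranked[0]}
    used = len(ranked[0].split())

    # The next-best passage that fits the budget fills the "background" slot.
    for p in ranked[1:]:
        if used + len(p.split()) <= max_words:
            slots["background"] = p
            break

    # Emit slots in template order, skipping any that stayed empty.
    return "\n\n".join(slots[name] for name in TEMPLATE if name in slots)
```

Under these assumptions, a small budget leaves only the `main` slot filled, which loosely mirrors a short indicative abstract, while a larger budget also fills `background`, which loosely mirrors a longer informative digest. Because passages are copied whole rather than cut at sentence level, each slot remains readable on its own.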