TextTiling: A Quantitative Approach to Discourse

  • Authors:
  • Marti A. Hearst

  • Affiliations:
  • -

  • Venue:
  • TextTiling: A Quantitative Approach to Discourse
  • Year:
  • 1993

Abstract

This paper presents TextTiling, a method for partitioning full-length text documents into coherent multiparagraph units. The layout of text tiles is meant to reflect the pattern of subtopics contained in an expository text. The approach uses lexical analyses based on tf.idf, an information retrieval measurement, to determine the extent of the tiles, incorporating thesaural information via a statistical disambiguation algorithm. The tiles have been found to correspond well to human judgements of the major subtopic boundaries of science magazine articles.
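
As a rough illustration of the idea in the abstract, the following Python sketch weights terms by tf.idf, compares adjacent blocks of text, and marks a candidate tile boundary wherever the lexical similarity between neighbours is low. The abstract does not specify these details, so everything here is an assumption made for illustration: the function names, the use of cosine similarity, the treatment of each block as a "document" for the idf count, and the fixed threshold. The thesaural disambiguation step is omitted entirely.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(blocks):
    """Weight each term by tf * log(N / df), treating each block as a document.

    Assumption: this is one common tf.idf variant, not necessarily the paper's.
    """
    counts = [Counter(tokenize(b)) for b in blocks]
    n = len(blocks)
    df = Counter(term for c in counts for term in c)
    return [{t: tf * math.log(n / df[t]) for t, tf in c.items()} for c in counts]


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def adjacent_similarities(blocks):
    """Similarity score for each gap between block i and block i + 1."""
    vecs = tfidf_vectors(blocks)
    return [cosine(vecs[i], vecs[i + 1]) for i in range(len(vecs) - 1)]


def boundaries(sims, threshold):
    """Gap indices whose similarity falls below the (hypothetical) threshold."""
    return [i for i, s in enumerate(sims) if s < threshold]


# Toy input: two blocks about astronomy followed by two about volcanoes.
blocks = [
    "stars galaxy telescope stars",
    "telescope galaxy orbit stars",
    "volcano lava eruption magma",
    "magma lava volcano ash",
]
sims = adjacent_similarities(blocks)
tile_breaks = boundaries(sims, threshold=0.2)
```

In this toy input the second and third blocks share no vocabulary, so the gap between them scores zero and is the natural place for a tile boundary. A fixed threshold is the simplest possible decision rule and is used here only to keep the sketch short.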