A statistical model for domain-independent text segmentation

  • Authors:
  • Masao Utiyama; Hitoshi Isahara

  • Affiliations:
  • Communications Research Laboratory, Soraku-gun, Kyoto, Japan; Communications Research Laboratory, Soraku-gun, Kyoto, Japan

  • Venue:
  • ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
  • Year:
  • 2001

Abstract

We propose a statistical method that finds the maximum-probability segmentation of a given text. The method requires no training data because it estimates probabilities from the given text itself, so it can be applied to any text in any domain. In an experiment, the method was at least as accurate as a state-of-the-art text segmentation system.
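
To illustrate the idea of a maximum-probability segmentation estimated from the text alone, here is a minimal sketch. The abstract does not spell out the model, so everything specific here is an assumption made for illustration: each segment is scored with a Laplace-smoothed unigram model estimated from that segment's own words, each segment adds a prior penalty of log(n) where n is the text length in words, and the best segmentation is found by dynamic programming over candidate boundaries. The names `segment_cost` and `segment` are hypothetical.

```python
import math
from collections import Counter


def segment_cost(words, vocab_size):
    """Negative log-probability of one segment.

    Assumption: a Laplace-smoothed unigram model estimated from the
    segment itself, i.e. P(w) = (count(w) + 1) / (len(segment) + vocab_size).
    """
    counts = Counter(words)
    n = len(words)
    return -sum(c * math.log((c + 1) / (n + vocab_size)) for c in counts.values())


def segment(words, penalty=None):
    """Return the exclusive end index of each segment in the lowest-cost split.

    Total cost = sum of segment costs + one penalty per segment.
    Assumption: the default penalty is log(len(words)), acting as a prior
    that discourages splitting the text into too many segments.
    """
    n = len(words)
    if n == 0:
        return []
    vocab_size = len(set(words))
    if penalty is None:
        penalty = math.log(n)

    best = [0.0] + [math.inf] * n   # best[i]: minimum cost of words[:i]
    back = [0] * (n + 1)            # back[i]: start of the last segment ending at i
    for end in range(1, n + 1):
        for start in range(end):
            cost = best[start] + segment_cost(words[start:end], vocab_size) + penalty
            if cost < best[end]:
                best[end] = cost
                back[end] = start

    # Walk the back-pointers from the end of the text to recover boundaries.
    bounds = []
    pos = n
    while pos > 0:
        bounds.append(pos)
        pos = back[pos]
    return bounds[::-1]
```

Calling `segment(tokens)` on a list of word tokens returns boundary positions without any training corpus, since all probabilities come from `tokens` itself. This sketch recomputes each candidate segment's counts from scratch, which is simple but slow on long texts; a practical version would restrict boundaries to sentence ends and update counts incrementally.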