A Dynamic Programming Algorithm for Linear Text Segmentation

  • Authors: P. Fragkou; V. Petridis; Ath. Kehagias

  • Affiliations: -; -; kehagias@egnatia.ee.auth.gr

  • Venue: Journal of Intelligent Information Systems

  • Year: 2004

Abstract

In this paper we introduce a dynamic programming algorithm that performs linear text segmentation by globally minimizing a segmentation cost function. The cost function incorporates two factors: (a) within-segment word similarity and (b) prior information about segment length. We evaluate the segmentation accuracy of the algorithm by precision, recall, and Beeferman's segmentation metric. On a segmentation task involving Choi's text collection, the algorithm achieves the best segmentation accuracy reported in the literature so far. The algorithm also achieves high accuracy on a second task involving previously unused texts.
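
The global minimization described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact cost function, so the form used here is an assumption: each segment pays a Gaussian-style penalty for deviating from an expected length `mu` (with spread `sigma`) and earns a reward equal to the fraction of sentence pairs inside it that share at least one word. The weight `gamma`, the word-overlap similarity, and the function names `segment_cost` and `segment_text` are all hypothetical choices for illustration, not the authors' implementation.

```python
import math


def segment_cost(prefix, s, t, mu, sigma, gamma):
    """Assumed cost of one segment covering sentences s..t-1."""
    length = t - s
    # Sum of pairwise similarities inside the segment, via 2D prefix sums.
    total = prefix[t][t] - prefix[s][t] - prefix[t][s] + prefix[s][s]
    density = total / (length * length)          # within-segment similarity
    length_penalty = (length - mu) ** 2 / (2 * sigma ** 2)  # length prior
    return gamma * length_penalty - (1 - gamma) * density


def segment_text(sentences, mu, sigma, gamma=0.5):
    """Return segment end positions (exclusive) that minimize the total cost."""
    n = len(sentences)
    bags = [set(s.lower().split()) for s in sentences]

    # prefix[i][j] = sum of sim[a][b] for a < i and b < j, where
    # sim[a][b] is 1 if sentences a and b share a word (assumed measure).
    prefix = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(n):
            shared = 1 if bags[i] & bags[j] else 0
            prefix[i + 1][j + 1] = (
                shared + prefix[i][j + 1] + prefix[i + 1][j] - prefix[i][j]
            )

    # best[t] = minimum cost of segmenting the first t sentences.
    best = [math.inf] * (n + 1)
    best[0] = 0.0
    back = [0] * (n + 1)
    for t in range(1, n + 1):
        for s in range(t):
            cost = best[s] + segment_cost(prefix, s, t, mu, sigma, gamma)
            if cost < best[t]:
                best[t] = cost
                back[t] = s

    # Follow back-pointers from the end to recover the boundaries.
    bounds = []
    t = n
    while t > 0:
        bounds.append(t)
        t = back[t]
    return bounds[::-1]


example = [
    "cats purr softly", "cats chase mice", "mice fear cats",
    "stocks fell sharply", "stocks rebound today", "traders sell stocks",
]
print(segment_text(example, mu=3.0, sigma=1.0))  # [3, 6]
```

Because every candidate segment is scored independently, the optimal cost for the first `t` sentences depends only on the optimal costs for shorter prefixes, which is what makes dynamic programming applicable. In this sketch the search is quadratic in the number of sentences; in practice `mu` and `sigma` would be estimated from training data rather than set by hand.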