Subsequence-Based Text Segmentation and Labeling

  • Authors: Xi Chen; Shihong Chen

  • Affiliations: not listed

  • Venue: ETCS '09: Proceedings of the 2009 First International Workshop on Education Technology and Computer Science, Volume 01
  • Year: 2009

Abstract

Text segmentation is important for many natural language processing tasks, such as passage retrieval and summarization. This paper uses a suffix tree model for text representation and introduces a new measure, subsequence-based coherence, which captures the coherence between sentences while exploiting word order information. The paper also introduces a text segmentation algorithm, subsequence-based maximum cut, and a passage labeling approach based on subsequences. Segmentation results on educational text show that the method outperforms some existing methods, and the passage labeling results are satisfactory.
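The abstract does not define subsequence-based coherence or the maximum-cut algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that coherence can be approximated by the contiguous word sequences two sentences share (the kind of information a word-level suffix tree exposes), and it swaps in a simple greedy rule, cutting at the weakest links between adjacent sentences, in place of the paper's maximum cut. The names `word_ngrams`, `coherence`, `segment`, and the parameter `max_len` are hypothetical.

```python
from typing import List, Set, Tuple


def word_ngrams(tokens: List[str], max_len: int) -> Set[Tuple[str, ...]]:
    """Return every contiguous word sequence of length 1..max_len."""
    return {
        tuple(tokens[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(tokens) - n + 1)
    }


def coherence(a: str, b: str, max_len: int = 3) -> float:
    """Illustrative score in [0, 1]: shared word sequences, weighted by length.

    Assumption: this stands in for the paper's measure, which the abstract
    does not specify. Longer shared sequences count more, so word order matters.
    """
    grams_a = word_ngrams(a.lower().split(), max_len)
    grams_b = word_ngrams(b.lower().split(), max_len)
    shared = sum(len(g) for g in grams_a & grams_b)
    smaller = min(sum(len(g) for g in grams_a), sum(len(g) for g in grams_b))
    return shared / smaller if smaller else 0.0


def segment(sentences: List[str], num_segments: int) -> List[List[str]]:
    """Greedy stand-in for a cut-based segmenter: split at the weakest links."""
    scores = [
        coherence(sentences[i], sentences[i + 1])
        for i in range(len(sentences) - 1)
    ]
    # Indices of the lowest-scoring adjacent pairs; a cut falls after sentence i.
    weakest = sorted(range(len(scores)), key=scores.__getitem__)
    cuts = sorted(weakest[:num_segments - 1])
    segments, start = [], 0
    for c in cuts:
        segments.append(sentences[start:c + 1])
        start = c + 1
    segments.append(sentences[start:])
    return segments


if __name__ == "__main__":
    text = [
        "the suffix tree stores word order",
        "the suffix tree is built per sentence",
        "students read the passage aloud",
        "students read the passage twice",
    ]
    for part in segment(text, 2):
        print(part)
```

Because the score weights each shared sequence by its length, two sentences with the same words in a different order score lower than two sentences sharing whole phrases, which is one plausible reading of "utilize the word order information".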