Text segmentation with multiple surface linguistic cues

  • Authors:
  • Mochizuki Hajime;Honda Takeo;Okumura Manabu

  • Affiliations:
  • Japan Advanced Institute of Science and Technology, Ishikawa, Japan;Japan Advanced Institute of Science and Technology, Ishikawa, Japan;Japan Advanced Institute of Science and Technology, Ishikawa, Japan

  • Venue:
  • COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2
  • Year:
  • 1998

Quantified Score

Hi-index 0.00

Visualization

Abstract

In general, a certain range of sentences in a text, is widely assumed to form a coherent unit which is called a discourse segment. Identifying the segment boundaries is a first step to recognize the structure of a text. In this paper, we describe a method for identifying segment boundaries of a Japanese text with the aid of multiple surface linguistic cues, though our experiments might be small-scale. We also present a method of training the weights for multiple linguistic cues automatically without the overfitting problem.