Morphological analysis of the spontaneous speech corpus

  • Authors:
  • Kiyotaka Uchimoto; Chikashi Nobata; Atsushi Yamada; Hitoshi Isahara; Satoshi Sekine

  • Affiliations:
  • Communications Research Laboratory, Kyoto, Japan; Communications Research Laboratory, Kyoto, Japan; Communications Research Laboratory, Kyoto, Japan; Communications Research Laboratory, Kyoto, Japan; New York University, New York, NY

  • Venue:
  • COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 2
  • Year:
  • 2002

Abstract

This paper describes a project to tag a spontaneous speech corpus with morphological information such as word segmentation and parts of speech. We use a morphological analysis system based on a maximum entropy model, which is independent of the corpus domain. We report the tagging accuracy achieved with this model and discuss the problems encountered in tagging the spontaneous speech corpus. We also show that a dictionary developed for a corpus in one domain helps improve accuracy when analyzing a corpus in another domain.