Dependency structure analysis and sentence boundary detection in spontaneous Japanese

  • Authors:
  • Kazuya Shitaoka;Kiyotaka Uchimoto;Tatsuya Kawahara;Hitoshi Isahara

  • Affiliations:
  • Kyoto University, Kyoto, Japan;National Institute of Information and Communications Technology, Kyoto, Japan;Kyoto University, Kyoto, Japan;National Institute of Information and Communications Technology, Kyoto, Japan

  • Venue:
  • COLING '04 Proceedings of the 20th international conference on Computational Linguistics
  • Year:
  • 2004

Abstract

This paper describes a project to detect two things in a spontaneous speech corpus: dependencies between Japanese phrasal units called bunsetsus, and sentence boundaries. In monologues, the biggest problem with dependency structure analysis is that sentence boundaries are ambiguous. We propose two methods for improving the accuracy of sentence boundary detection in spontaneous Japanese speech: one is based on statistical machine translation using dependency information, and the other is based on text chunking using a support vector machine (SVM). With the proposed methods, sentence boundary detection achieved an F-measure of 84.9. Using automatically detected sentence boundaries also improved the accuracy of dependency structure analysis from 75.2% to 77.2%. The accuracy of both dependency structure analysis and sentence boundary detection was also improved by interactively using the automatically detected dependency structures and sentence boundaries.
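
To make the two reported metrics concrete, the following is a minimal sketch of how an F-measure for sentence boundary detection and an accuracy for dependency structure analysis might be computed. It assumes the usual balanced F-measure (harmonic mean of precision and recall) and a simple per-bunsetsu head-match accuracy; the abstract does not spell out the exact evaluation protocol, so the function names, the representation of boundaries as bunsetsu indices, and the representation of dependencies as head indices are all hypothetical.

```python
from typing import Iterable, Sequence


def boundary_f_measure(gold: Iterable[int], predicted: Iterable[int]) -> float:
    """Balanced F-measure over sentence boundary positions.

    Assumption: each boundary is identified by the index of the bunsetsu
    after which a sentence ends. This is an illustrative representation,
    not necessarily the one used in the paper.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def dependency_accuracy(gold_heads: Sequence[int], predicted_heads: Sequence[int]) -> float:
    """Fraction of bunsetsus whose predicted head matches the gold head.

    Assumption: heads are given as one index per bunsetsu, in the same order
    for both sequences. How sentence-final bunsetsus are counted is left to
    the caller.
    """
    if len(gold_heads) != len(predicted_heads):
        raise ValueError("gold and predicted head lists must have the same length")
    if not gold_heads:
        return 0.0
    correct = sum(1 for g, p in zip(gold_heads, predicted_heads) if g == p)
    return correct / len(gold_heads)
```

For example, `boundary_f_measure([3, 7, 12], [3, 7, 10, 15])` has two correct boundaries out of four predicted and three gold ones, giving precision 0.5, recall 2/3, and an F-measure of 4/7. Multiplying by 100 puts the value on the same scale as the 84.9 reported above.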