Dependency structure analysis and sentence boundary detection in spontaneous Japanese

Authors:
Kazuya Shitaoka;Kiyotaka Uchimoto;Tatsuya Kawahara;Hitoshi Isahara
Affiliations:
Kyoto University, Kyoto, Japan;National Institute of Information and Communications Technology, Kyoto, Japan;Kyoto University, Kyoto, Japan;National Institute of Information and Communications Technology, Kyoto, Japan
Venue:
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Year:
2004

Abstract

This paper describes a project to detect both sentence boundaries and dependencies between Japanese phrasal units, called bunsetsus, in a spontaneous speech corpus. In monologues, the biggest problem for dependency structure analysis is that sentence boundaries are ambiguous. We propose two methods for improving the accuracy of sentence boundary detection in spontaneous Japanese speech: one is based on statistical machine translation using dependency information, and the other is based on text chunking using support vector machines (SVMs). The proposed methods achieved an F-measure of 84.9 for sentence boundary detection. Using automatically detected sentence boundaries also improved the accuracy of dependency structure analysis from 75.2% to 77.2%. The accuracy of both dependency structure analysis and sentence boundary detection also improved when automatically detected dependency structures and sentence boundaries were used interactively.
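
The F-measure reported above combines the precision and recall of the detected boundaries. The following is a minimal sketch of that calculation, not the authors' evaluation script. It assumes each sentence boundary is identified by the index of the bunsetsu it follows and that a detected boundary counts as correct only on an exact index match. The function name `boundary_f_measure` and the sample index sets are hypothetical.

```python
def boundary_f_measure(gold, predicted):
    """Return (precision, recall, F-measure) for sentence boundary detection.

    gold, predicted: iterables of bunsetsu indices after which a sentence
    boundary occurs (an assumed representation, chosen for illustration).
    """
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)  # boundaries detected at the right position

    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    # Harmonic mean of precision and recall (balanced F-measure).
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical data: 4 reference boundaries, 5 detected, 3 of them correct.
gold = {4, 9, 15, 22}
predicted = {4, 9, 16, 22, 30}
p, r, f = boundary_f_measure(gold, predicted)
print(f"precision={p:.3f} recall={r:.3f} F={100 * f:.1f}")
```

The last line scales the F-measure by 100 to match the 0 to 100 scale on which the abstract reports 84.9.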