Using a partially annotated corpus to build a dependency parser for Japanese

Authors: Manabu Sassano
Affiliations: Fujitsu Laboratories, Ltd., Kawasaki, Japan
Venue: IJCNLP '05: Proceedings of the Second International Joint Conference on Natural Language Processing
Year: 2005

Abstract

We explore the use of a partially annotated corpus to build a dependency parser for Japanese, examining two types of partially annotated corpora. A parser trained on a corpus that has no grammatical tags for words achieves an accuracy of 87.38%, which is comparable to the current state-of-the-art accuracy on the Kyoto University Corpus. In contrast, a parser trained on a corpus that has only dependency annotations for each pair of adjacent bunsetsus (chunks) shows moderate performance. Nonetheless, features based on character n-grams prove very useful for a Japanese dependency parser.
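
As a rough illustration of what character n-gram features can look like when no grammatical tags are available, the sketch below extracts n-grams from the surface strings of a candidate modifier/head bunsetsu pair. The abstract does not specify the feature templates used in the paper, so the function names `char_ngrams` and `pair_features`, the `mod:`/`head:` prefixes, and the choice of n are all assumptions made for this example only.

```python
def char_ngrams(text, n_max=3):
    """Return all character n-grams of length 1..n_max, shortest first."""
    return [
        text[i:i + n]
        for n in range(1, n_max + 1)
        for i in range(len(text) - n + 1)
    ]


def pair_features(modifier, head, n_max=2):
    """Hypothetical feature set for a candidate (modifier, head) bunsetsu pair.

    Only surface characters are used, so no part-of-speech or other
    grammatical tags are required. This is an illustrative template,
    not the feature set from the paper.
    """
    feats = {f"mod:{g}" for g in char_ngrams(modifier, n_max)}
    feats |= {f"head:{g}" for g in char_ngrams(head, n_max)}
    return feats


if __name__ == "__main__":
    # Two bunsetsus from a toy sentence: "太郎は" (Taro-TOPIC) and "走る" (runs).
    for feature in sorted(pair_features("太郎は", "走る")):
        print(feature)
```

Binary features of this kind could be fed to any classifier that decides whether one bunsetsu modifies another; which learner and which additional features the paper combines them with is not stated in the abstract.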