A stochastic parser based on a structural word prediction model

  • Authors:
  • Shinsuke Mori; Masafumi Nishimura; Nobuyasu Itoh; Shiho Ogino; Hideo Watanabe

  • Affiliations:
  • IBM Research, Tokyo Research Laboratory, IBM Japan, Ltd., Shimotsuruma, Yamato-shi, Japan (all authors)

  • Venue:
  • COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 1
  • Year:
  • 2000


Abstract

In this paper, we present a stochastic language model based on dependency. The model treats a sentence as a word sequence and predicts each word from left to right. The history at each step of prediction is a sequence of partial parse trees covering the preceding words. The model first predicts which of the partial parse trees have a dependency relation with the next word, and then predicts the next word from only those trees. Since our model is a generative stochastic model, it can be used not only as a parser but also as the language model of a speech recognizer. In our experiment, we prepared about 1,000 syntactically annotated Japanese sentences extracted from a financial newspaper and estimated the parameters of our model. We built a parser based on the model and tested it on approximately 100 sentences from the same newspaper. The accuracy of the dependency relations was 89.9%, the highest level obtained by Japanese stochastic parsers.
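The two-step prediction described above (first choose which partial parse trees attach to the next word, then predict the word from those trees) can be sketched roughly as follows. This is an illustrative simplification, not the paper's actual formulation: the class name, the reduction of each attachment decision to a count `k` of rightmost trees, the truncated two-tree history, and the add-one smoothing are all assumptions made for the sketch.

```python
import math
from collections import defaultdict

class DependencyLM:
    """Hedged sketch of a generative, left-to-right dependency language
    model: history = heads of partial parse trees built so far."""

    def __init__(self):
        # P(k | recent tree heads): how many rightmost partial trees
        # have a dependency relation with (attach to) the next word.
        self.attach_counts = defaultdict(lambda: defaultdict(int))
        # P(word | heads of the attaching trees)
        self.word_counts = defaultdict(lambda: defaultdict(int))

    def train(self, sentences):
        # Each training sentence is a list of (word, k) pairs, where k is
        # the number of rightmost partial trees the word takes as
        # dependents -- a flattening of the syntactic annotation.
        for sent in sentences:
            stack = []  # heads of the current partial parse trees
            for word, k in sent:
                context = tuple(stack[-2:])  # truncated history (assumption)
                self.attach_counts[context][k] += 1
                heads = tuple(stack[len(stack) - k:]) if k else ()
                self.word_counts[heads][word] += 1
                # the k attached trees merge under the new word
                stack = stack[:len(stack) - k] + [word]

    def prob(self, sentence):
        # Joint probability of a sentence with its attachment decisions,
        # with add-one smoothing so unseen events keep non-zero mass.
        logp = 0.0
        stack = []
        for word, k in sentence:
            context = tuple(stack[-2:])
            kc = self.attach_counts[context]
            logp += math.log((kc[k] + 1) / (sum(kc.values()) + 4))
            heads = tuple(stack[len(stack) - k:]) if k else ()
            wc = self.word_counts[heads]
            logp += math.log((wc[word] + 1) / (sum(wc.values()) + 1000))
            stack = stack[:len(stack) - k] + [word]
        return math.exp(logp)
```

Because the model is generative, the same `prob` score can rank speech-recognition hypotheses, while maximizing it over attachment sequences yields a parse, which is the dual use the abstract points out.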