A stochastic parser based on an SLM with arboreal context trees

  • Authors: Shinsuke Mori
  • Affiliations: IBM Research, Japan
  • Venue: COLING '02: Proceedings of the 19th International Conference on Computational Linguistics - Volume 1
  • Year: 2002

Abstract

In this paper, we present a parser based on a stochastic structured language model (SLM) with a flexible history reference mechanism. An SLM is an alternative to an n-gram model as a language model for a speech recognizer. The advantage of an SLM over an n-gram model is its ability to return the structure of a given sentence, so SLMs are expected to play an important part in spoken language understanding systems. Current SLMs, like n-gram models, refer to a fixed part of the history for prediction. We introduce a flexible history reference mechanism called an ACT (arboreal context tree), an extension of the context tree to tree-shaped histories, and describe a parser based on an SLM with ACTs. In our experiment, we built one SLM-based parser with a fixed history and one with ACTs, and compared their parsing accuracies. The parser with ACTs achieved an accuracy of 92.8%, higher than the 89.8% of the parser with the fixed history. This result shows that the flexible history reference mechanism improves the parsing ability of an SLM, which is of great importance for language understanding.
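To make the contrast between a fixed and a flexible history concrete, the following is a minimal sketch of an ordinary context tree over a linear word history, the structure that the abstract says ACTs extend to tree-shaped histories. It is not the paper's ACT or its parser: the class names `ContextNode` and `ContextTree`, the `max_depth` and `min_count` parameters, the count-threshold rule for choosing the context length, and the toy corpus are all assumptions made for illustration.

```python
from collections import defaultdict


class ContextNode:
    """One node of a context tree: a context plus next-word counts."""

    def __init__(self):
        self.children = {}              # previous word -> deeper (longer) context
        self.counts = defaultdict(int)  # next word -> observed frequency


class ContextTree:
    """Hypothetical variable-length history model over word sequences."""

    def __init__(self, max_depth=2):
        self.root = ContextNode()
        self.max_depth = max_depth

    def add(self, history, word):
        """Record `word` under every suffix of `history` up to max_depth."""
        node = self.root
        node.counts[word] += 1
        recent = history[-self.max_depth:] if self.max_depth else []
        for prev in reversed(recent):   # walk from most recent word backwards
            node = node.children.setdefault(prev, ContextNode())
            node.counts[word] += 1

    def predict(self, history, word, min_count=2):
        """Relative frequency of `word` in the longest context that has
        been seen at least `min_count` times (assumed selection rule)."""
        node = self.root
        for prev in reversed(history):
            child = node.children.get(prev)
            if child is None or sum(child.counts.values()) < min_count:
                break                   # context too sparse: stop lengthening
            node = child
        total = sum(node.counts.values())
        return node.counts.get(word, 0) / total if total else 0.0


# Toy training data (an assumption, not from the paper).
corpus = ["the", "dog", "barks", "the", "dog", "runs", "a", "dog", "barks"]
tree = ContextTree(max_depth=2)
for i, w in enumerate(corpus):
    tree.add(corpus[:i], w)

p_long = tree.predict(["the", "dog"], "barks")   # uses the two-word context
p_short = tree.predict(["a", "dog"], "barks")    # falls back to one word
p_root = tree.predict(["zzz"], "dog")            # unseen history: no context
```

In this sketch the amount of history consulted varies per prediction: a well-attested context such as "the dog" is used in full, while a sparse one is shortened until enough evidence is available. A fixed-history model would always use the same number of preceding words.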