Estimating Grammar Parameters Using Bounded Memory

  • Authors:
  • Tim Oates; Brent Heeringa

  • Venue:
  • ICGI '02 Proceedings of the 6th International Colloquium on Grammatical Inference: Algorithms and Applications
  • Year:
  • 2002

Abstract

Estimating the parameters of stochastic context-free grammars (SCFGs) from data is an important, well-studied problem. Almost without exception, existing approaches make repeated passes over the training data. Because they must retain that data, their memory requirements make them ill-suited for embedded agents exposed to large amounts of training data over long periods of time. We present a novel algorithm, called HOLA, for estimating the parameters of SCFGs that computes summary statistics for each string as it is observed and then discards the string. The memory used by HOLA is bounded by the size of the grammar, not by the amount of training data. Empirical results show that HOLA performs as well as the Inside-Outside algorithm on a variety of standard problems, despite having access to much less information.
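
The bounded-memory idea in the abstract can be illustrated with a minimal sketch. This is not HOLA itself: the abstract does not specify which summary statistics HOLA keeps or how it updates parameters. The sketch assumes, purely for illustration, that each observed string arrives with the list of rules used to derive it, and it swaps in simple relative-frequency estimation. The class name `StreamingRuleCounter` and its methods are hypothetical.

```python
from collections import Counter, defaultdict


class StreamingRuleCounter:
    """Hypothetical illustration: one counter per grammar rule.

    Memory grows with the number of rules, not with the number of
    strings observed. This is NOT the HOLA algorithm.
    """

    def __init__(self, rules):
        # rules: iterable of (lhs, rhs) pairs for a fixed, known grammar,
        # e.g. ("S", ("a", "S")). Every rule starts with a count of zero.
        self.counts = Counter({rule: 0 for rule in rules})

    def observe(self, derivation):
        # derivation: the rules used to derive one string (assumed to be
        # given; obtaining it would need a parser, which is omitted here).
        for rule in derivation:
            if rule not in self.counts:
                raise KeyError(f"rule not in grammar: {rule!r}")
            self.counts[rule] += 1
        # Nothing else is stored, so the string can now be discarded.

    def probabilities(self):
        # Relative-frequency estimate, normalized per left-hand side.
        totals = defaultdict(int)
        for (lhs, _), count in self.counts.items():
            totals[lhs] += count
        return {
            rule: (count / totals[rule[0]] if totals[rule[0]] else 0.0)
            for rule, count in self.counts.items()
        }
```

Calling `observe` once per string and `probabilities` at any time gives a single-pass estimate whose storage is fixed by the rule set, which is the property the abstract contrasts with multi-pass methods such as Inside-Outside.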