Computation of the probability of initial substring generation by stochastic context-free grammars

  • Authors:
  • Frederick Jelinek; John D. Lafferty

  • Affiliations:
  • IBM T. J. Watson Research Center; IBM T. J. Watson Research Center

  • Venue:
  • Computational Linguistics
  • Year:
  • 1991

Abstract

Speech recognition language models are based on probabilities $P(W_{k+1} = v \mid w_1, w_2, \ldots, w_k)$ that the next word $W_{k+1}$ will be any particular word $v$ of the vocabulary, given that the word sequence $w_1, w_2, \ldots, w_k$ is hypothesized to have been uttered in the past. If probabilistic context-free grammars are to be used as the basis of the language model, it will be necessary to compute the probability that successive application of the grammar rewrite rules (beginning with the sentence start symbol $s$) produces a word string whose initial substring is an arbitrary sequence $w_1, w_2, \ldots, w_{k+1}$. In this paper we describe a new algorithm that achieves the required computation in at most a constant times $k^3$ steps.
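
To make the link between the two quantities in the abstract concrete: by the definition of conditional probability, the next-word probability is the ratio of two initial-substring (prefix) probabilities, the one for the history extended by v divided by the one for the history alone. The following is a minimal sketch of that relationship only. It is not the paper's cubic-time algorithm: it uses brute-force enumeration of every sentence, which works only because the toy grammar is non-recursive and generates finitely many strings. The grammar, its rule probabilities, and the names `GRAMMAR`, `expansions`, `prefix_prob`, and `next_word_prob` are all hypothetical and chosen for illustration.

```python
from itertools import product

# Hypothetical toy grammar (an assumption, not from the paper):
# nonterminal -> list of (rule probability, right-hand side).
# Any symbol that is not a key of GRAMMAR is treated as a terminal word.
# The grammar is non-recursive, so it generates finitely many sentences.
GRAMMAR = {
    "S": [(1.0, ("NP", "VP"))],
    "NP": [(0.6, ("dogs",)), (0.4, ("cats",))],
    "VP": [(0.7, ("V", "NP")), (0.3, ("sleep",))],
    "V": [(1.0, ("chase",))],
}


def expansions(symbol):
    """Yield (probability, word tuple) for every derivation from symbol."""
    if symbol not in GRAMMAR:
        # Terminal word: it derives itself with probability 1.
        yield 1.0, (symbol,)
        return
    for rule_prob, rhs in GRAMMAR[symbol]:
        # Pick one expansion for each right-hand-side symbol and combine them.
        for parts in product(*(list(expansions(s)) for s in rhs)):
            prob = rule_prob
            words = ()
            for part_prob, part_words in parts:
                prob *= part_prob
                words += part_words
            yield prob, words


def prefix_prob(prefix, start="S"):
    """Probability that a generated sentence begins with the given words."""
    prefix = tuple(prefix)
    return sum(p for p, words in expansions(start)
               if words[:len(prefix)] == prefix)


def next_word_prob(word, history):
    """P(next word = word | history), as a ratio of prefix probabilities."""
    denom = prefix_prob(history)
    if denom == 0.0:
        raise ValueError("history has zero probability under the grammar")
    return prefix_prob(tuple(history) + (word,)) / denom
```

For example, `next_word_prob("chase", ("dogs",))` divides the total probability of sentences starting with "dogs chase" by the total probability of sentences starting with "dogs". Enumeration of this kind grows with the number of sentences and does not terminate for recursive grammars, which is why an algorithm bounded by a constant times k^3 steps matters.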