An algorithm for estimating the parameters of unrestricted hidden stochastic context-free grammars

  • Authors: Julian Kupiec
  • Affiliations: Xerox Palo Alto Research Center, Palo Alto, CA
  • Venue: COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 1
  • Year: 1992

Abstract

A new algorithm is presented for estimating the parameters of a stochastic context-free grammar (SCFG) from ordinary unparsed text. Unlike the Inside/Outside (I/O) algorithm, which requires the grammar to be specified in Chomsky normal form, the new algorithm can estimate an arbitrary SCFG without any transformation. The algorithm has worst-case cubic complexity in the length of a sentence and the number of nonterminals in the grammar. Instead of the binary branching tree structure used by the I/O algorithm, the new algorithm uses a trellis structure for computation. The trellis is a generalization of the one used by the Baum-Welch algorithm, which is used for estimating hidden stochastic regular grammars. The paper describes the relationship between the trellis and the more typical parse tree representation.
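
For contrast with the trellis approach, the inside pass of the I/O algorithm on a grammar in Chomsky normal form can be sketched roughly as below. This is only an illustration of the baseline the abstract compares against, not the paper's trellis algorithm. The function name `inside_probability`, the dictionary encoding of the rules, and the toy grammar are all assumptions made for the example.

```python
from collections import defaultdict


def inside_probability(words, lexical, binary, start="S"):
    """Probability that `start` derives `words` under a CNF SCFG.

    lexical: {(A, terminal): prob} for rules A -> terminal   (assumed encoding)
    binary:  {(A, B, C): prob}     for rules A -> B C        (assumed encoding)
    """
    n = len(words)
    # chart[i][j][A] = probability that nonterminal A derives words[i:j]
    chart = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]

    # Spans of length 1 are filled from the lexical rules.
    for i, word in enumerate(words):
        for (a, terminal), p in lexical.items():
            if terminal == word:
                chart[i][i + 1][a] += p

    # Longer spans sum over every split point and every binary rule.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (a, b, c), p in binary.items():
                    left = chart[i][k].get(b, 0.0)
                    right = chart[k][j].get(c, 0.0)
                    if left and right:
                        chart[i][j][a] += p * left * right

    return chart[0][n].get(start, 0.0)


# Hypothetical toy grammar: S -> S S (0.4) | 'a' (0.6)
toy_lexical = {("S", "a"): 0.6}
toy_binary = {("S", "S", "S"): 0.4}

p_two = inside_probability(["a", "a"], toy_lexical, toy_binary)
p_three = inside_probability(["a", "a", "a"], toy_lexical, toy_binary)
```

The nested loops over span length, start position, and split point are what give this baseline its cubic cost in sentence length. The three-word call sums over both possible binary bracketings, which is why the chart stores probabilities rather than a single parse.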