Best-first word-lattice parsing: techniques for integrated syntactic language modeling

  • Authors:
  • Mark Johnson; Keith B. Hall

  • Affiliations:
  • Brown University; Brown University

  • Venue:
  • Best-first word-lattice parsing: techniques for integrated syntactic language modeling
  • Year:
  • 2005

Abstract

This thesis explores a language modeling technique based on statistical parsing. Previous research that exploits syntactic structure for language modeling has shown improved accuracy over standard trigram models. Unlike previous techniques, our parsing model performs syntactic analysis on sets of hypothesized word-strings simultaneously; these sets are encoded as word-lattices, represented as weighted finite-state automata. We present a best-first word-lattice chart parsing algorithm that combines the search for good parses with the search for good strings in the word-lattice. We describe how the word-lattice parser is combined with the Charniak language model, a sophisticated syntactic language model, to provide an efficient syntactic language model. We present results for this model on a standard set of speech recognition word-lattices. Finally, we examine variations of the word-lattice parser that aim to improve both performance and accuracy.
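
To make the word-lattice idea concrete, the following is a minimal sketch, not the thesis's parser: it encodes a small set of competing word-strings as a weighted acyclic graph and uses a priority-queue agenda to pull out the most probable string best-first. The toy lattice, its probabilities, and the names LATTICE and best_first_path are all hypothetical, and no syntactic analysis is performed; the sketch covers only the "search for good strings" half of what the abstract describes.

```python
import heapq
import math

# Hypothetical toy lattice (an assumption for illustration only):
# node -> list of (next_node, word, probability) arcs.
LATTICE = {
    0: [(1, "the", 0.6), (1, "a", 0.4)],
    1: [(2, "cat", 0.5), (2, "cap", 0.3), (2, "cut", 0.2)],
    2: [(3, "sat", 0.7), (3, "sad", 0.3)],
    3: [],
}


def best_first_path(lattice, start, final):
    """Return (cost, words) for the cheapest start-to-final path.

    Cost is the negative log probability of the path, so the cheapest
    path is the most probable word-string in the lattice.
    """
    # Agenda entries are (cost so far, node, words so far); heapq always
    # pops the lowest-cost entry, which gives best-first expansion.
    agenda = [(0.0, start, [])]
    visited = set()
    while agenda:
        cost, node, words = heapq.heappop(agenda)
        if node == final:
            return cost, words
        if node in visited:
            continue  # a cheaper route to this node was already expanded
        visited.add(node)
        for nxt, word, prob in lattice[node]:
            heapq.heappush(agenda, (cost - math.log(prob), nxt, words + [word]))
    return None  # final node unreachable


cost, words = best_first_path(LATTICE, 0, 3)
print(" ".join(words), round(math.exp(-cost), 3))
```

In this sketch the agenda is ordered by arc probabilities alone. A parser of the kind the abstract describes would instead rank chart edges by a figure of merit that also reflects syntactic structure, which is beyond what this example attempts.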