Statistical language modeling combining N-gram and context-free grammars

  • Authors:
  • Marie Meteer; J. Robin Rohlicek

  • Affiliations:
  • Rensselaer Polytechnic Institute, Troy, NY; BBN Systems and Technologies, Cambridge, Massachusetts

  • Venue:
  • ICASSP'93 Proceedings of the 1993 IEEE international conference on Acoustics, speech, and signal processing: speech processing - Volume II
  • Year:
  • 1993

Abstract

While statistically based Markov-chain language models (N-gram models) have been shown to be effective for speech recognition, there is, in general, more structure present in natural language than N-gram models can capture. Linguistically based approaches that use statistics to provide probabilities for word sequences accepted by a grammar typically require a full-coverage grammar and are therefore useful only for constrained sublanguages. In the work presented here, we combine linguistic structure in the form of a partial-coverage phrase structure grammar with statistical N-gram techniques. The result is a robust statistical grammar which explicitly incorporates linguistic and semantic structure. We are applying this approach to the recognition of air-traffic-control transmissions and have already shown that a simpler hybrid approach is useful. This work extends those preliminary results to a more general framework.
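
The abstract does not spell out how the grammar and the N-gram model are combined, so the following is only a minimal sketch of one possible reading, not the authors' method. It assumes that spans covered by a partial grammar are replaced by their nonterminal label, that a bigram model is trained over the resulting mix of words and nonterminals, and that a sentence score multiplies the bigram probabilities by the probability of each phrase given its nonterminal. The toy phrase table `PHRASES`, the labels `FLIGHT_ID` and `HEADING`, the example utterances, the greedy longest-match chunker, and the add-one smoothing are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy "grammar": each nonterminal expands to a finite set of
# word sequences, with expansion probabilities that sum to 1.
PHRASES = {
    "FLIGHT_ID": {("delta", "five", "one"): 0.5, ("united", "two"): 0.5},
    "HEADING": {("heading", "two", "seven", "zero"): 0.5,
                ("heading", "zero", "nine", "zero"): 0.5},
}

def chunk(words):
    """Greedy longest-match chunking (an assumption, not a real parser).

    Returns a list of (token, expansion_probability) pairs. Words that the
    grammar does not cover pass through unchanged with probability 1.0.
    """
    out, i = [], 0
    while i < len(words):
        best = None
        for nonterminal, expansions in PHRASES.items():
            for rhs, prob in expansions.items():
                matches = tuple(words[i:i + len(rhs)]) == rhs
                if matches and (best is None or len(rhs) > len(best[1])):
                    best = (nonterminal, rhs, prob)
        if best is not None:
            out.append((best[0], best[2]))
            i += len(best[1])
        else:
            out.append((words[i], 1.0))
            i += 1
    return out

def train_bigram(corpus):
    """Count bigrams over the mixed word/nonterminal token sequences."""
    context_counts, bigram_counts, vocab = Counter(), Counter(), set()
    for sentence in corpus:
        tokens = ["<s>"] + [t for t, _ in chunk(sentence.split())] + ["</s>"]
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, len(vocab)

def log_prob(sentence, model):
    """Log score: add-one-smoothed bigrams plus phrase expansion terms."""
    context_counts, bigram_counts, vocab_size = model
    chunks = chunk(sentence.split())
    tokens = ["<s>"] + [t for t, _ in chunks] + ["</s>"]
    total = sum(math.log(prob) for _, prob in chunks)
    for prev, cur in zip(tokens, tokens[1:]):
        numerator = bigram_counts[(prev, cur)] + 1
        total += math.log(numerator / (context_counts[prev] + vocab_size))
    return total

corpus = [
    "delta five one turn left heading two seven zero",
    "united two turn right heading zero nine zero",
]
model = train_bigram(corpus)
seen = log_prob(corpus[0], model)
recombined = log_prob("united two turn left heading two seven zero", model)
```

Under these assumptions, `recombined` receives the same score as `seen` even though that exact word string never occurs in the training data, because both reduce to the same token sequence `FLIGHT_ID turn left HEADING` and use equally likely expansions. Words outside the grammar are still handled by the bigram model, which is the sense in which a partial-coverage grammar can stay robust.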