A class-based language model for large-vocabulary speech recognition extracted from part-of-speech statistics

  • Authors:
  • C. Samuelsson;W. Reichl

  • Affiliations:
  • AT&TBell Labs., Murray Hill, NJ, USA;-

  • Venue:
  • ICASSP '99 Proceedings of the Acoustics, Speech, and Signal Processing, 1999. on 1999 IEEE International Conference - Volume 01
  • Year:
  • 1999

Quantified Score

Hi-index 0.00

Visualization

Abstract

A novel approach is presented to class-based language modeling based on part-of-speech statistics. It uses a deterministic word-to-class mapping, which handles words with alternative part-of-speech assignments through the use of ambiguity classes. The predictive power of word-based language models and the generalization capability of class-based language models are combined using both linear interpolation and word-to-class backoff, and both methods are evaluated. Since each word belongs to one precisely ambiguity class, an exact word-to-class backoff model can easily be constructed. Empirical evaluations on large-vocabulary speech-recognition tasks show perplexity improvements and significant reductions in word error-rate.