A class-based language model for large-vocabulary speech recognition extracted from part-of-speech statistics

Authors:
C. Samuelsson;W. Reichl
Affiliations:
AT&T Bell Labs., Murray Hill, NJ, USA; -
Venue:
ICASSP '99 Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 01
Year:
1999

Citing: 0
Cited by: 5

New models for improving supertag disambiguation
EACL '99 Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics

A salience driven approach to robust input interpretation in multimodal conversational systems
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing

Integrating multi-level linguistic knowledge with a unified framework for Mandarin speech recognition
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing

A word clustering approach for language model-based sentence retrieval in question answering systems
Proceedings of the 18th ACM conference on Information and knowledge management

Estimation of stochastic context-free grammars and their use as language models
Computer Speech and Language

Abstract

A novel approach to class-based language modeling, based on part-of-speech statistics, is presented. It uses a deterministic word-to-class mapping that handles words with alternative part-of-speech assignments through ambiguity classes. The predictive power of word-based language models and the generalization capability of class-based language models are combined using both linear interpolation and word-to-class backoff, and both methods are evaluated. Since each word belongs to precisely one ambiguity class, an exact word-to-class backoff model can easily be constructed. Empirical evaluations on large-vocabulary speech-recognition tasks show perplexity improvements and significant reductions in word error rate.
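
The ideas named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the n-gram order, the smoothing scheme, or the interpolation weight, so the bigram order, the absolute discount of 0.5, the weight `lam`, the toy `LEXICON` and `CORPUS`, and all function names below are assumptions made purely for illustration. The sketch shows (1) a deterministic mapping from each word to the ambiguity class formed by its set of possible tags, (2) linear interpolation of a word bigram with a class bigram, and (3) a backoff from the word bigram to the class bigram whose weight can be computed exactly because every word belongs to exactly one class.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> set of possible part-of-speech tags.
LEXICON = {
    "the": {"DT"}, "a": {"DT"},
    "dog": {"NN"}, "cat": {"NN"},
    "walk": {"NN", "VB"}, "run": {"NN", "VB"},
    "walks": {"NNS", "VBZ"}, "runs": {"NNS", "VBZ"},
}
VOCAB = sorted(LEXICON)


def ambiguity_class(word):
    """Deterministic mapping: the class is the word's full set of possible tags."""
    return "|".join(sorted(LEXICON[word]))


class BigramModels:
    """Word bigram, class bigram, and two ways of combining them (illustrative)."""

    def __init__(self, sentences, discount=0.5):
        self.discount = discount      # absolute discount (assumed value)
        self.word_uni = Counter()     # count(w)
        self.word_bi = Counter()      # count(h, w)
        self.hist = Counter()         # count(h) as a bigram history
        self.class_uni = Counter()    # count(c)
        self.class_bi = Counter()     # count(c(h), c(w))
        self.class_hist = Counter()   # count(c(h)) as a bigram history
        for sent in sentences:
            for w in sent:
                self.word_uni[w] += 1
                self.class_uni[ambiguity_class(w)] += 1
            for h, w in zip(sent, sent[1:]):
                ch, cw = ambiguity_class(h), ambiguity_class(w)
                self.word_bi[h, w] += 1
                self.hist[h] += 1
                self.class_bi[ch, cw] += 1
                self.class_hist[ch] += 1

    def p_word(self, w, h):
        """Maximum-likelihood word bigram P(w | h)."""
        return self.word_bi[h, w] / self.hist[h] if self.hist[h] else 0.0

    def p_class(self, w, h):
        """Class bigram: P(c(w) | c(h)) * P(w | c(w))."""
        ch, cw = ambiguity_class(h), ambiguity_class(w)
        if not self.class_hist[ch] or not self.class_uni[cw]:
            return 0.0
        p_transition = self.class_bi[ch, cw] / self.class_hist[ch]
        p_membership = self.word_uni[w] / self.class_uni[cw]
        return p_transition * p_membership

    def p_interpolated(self, w, h, lam=0.5):
        """Linear interpolation of the word and class models (lam is assumed)."""
        return lam * self.p_word(w, h) + (1.0 - lam) * self.p_class(w, h)

    def p_backoff(self, w, h):
        """Discounted word bigram if (h, w) was seen, else scaled class bigram."""
        if self.word_bi[h, w] > 0:
            return (self.word_bi[h, w] - self.discount) / self.hist[h]
        seen = [v for (g, v) in self.word_bi if g == h]
        if not seen:
            return self.p_class(w, h)  # unknown history: pure class model
        # Mass freed by discounting, spread over unseen words in proportion
        # to their class-model probability so the distribution sums to one.
        reserved = self.discount * len(seen) / self.hist[h]
        unseen_mass = 1.0 - sum(self.p_class(v, h) for v in seen)
        if unseen_mass <= 1e-12:
            return 0.0
        return reserved * self.p_class(w, h) / unseen_mass


# Hypothetical toy training corpus; the bigram "a dog" never occurs in it.
CORPUS = [
    ["the", "dog", "runs"],
    ["a", "cat", "walks"],
    ["the", "walk"],
    ["a", "run"],
    ["the", "cat", "runs"],
]
model = BigramModels(CORPUS)

if __name__ == "__main__":
    for name, p in [("word", model.p_word), ("class", model.p_class),
                    ("interpolated", model.p_interpolated),
                    ("backoff", model.p_backoff)]:
        print(f"P_{name}(dog | a) = {p('dog', 'a'):.3f}")
```

In this toy setup the word bigram assigns zero probability to "dog" after "a", while the class-based, interpolated, and backoff estimates are all nonzero, because "a" shares the class `DT` with "the" and "dog" shares the class `NN` with "cat". Because the class probabilities sum to one over the vocabulary, the backoff normalizer is a simple difference rather than a sum over all unseen words.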