Testing the correlation of word error rate and perplexity

Authors: Dietrich Klakow; Jochen Peters
Affiliation (both authors): Philips GmbH Forschungslaboratorien, Weisshausstr. 2, D-52066 Aachen, Germany
Venue: Speech Communication
Year: 2002

Abstract

Many groups have investigated the relationship between word error rate and the perplexity of language models. The issue is of central interest because perplexity can be optimized independently of a recognizer, and in most cases simple perplexity optimization procedures can be found. Moreover, many tasks in language model training, such as the optimization of word classes, can use perplexity as the target function, which yields explicit optimization formulas that are not available when error rates are the target. This paper first presents theoretical arguments for a close relationship between perplexity and word error rate. It then introduces the notion of the uncertainty of a measurement and uses it to test the hypothesis that word error rate and perplexity are related by a power law. No evidence is found to reject this hypothesis.
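
As a rough illustration of the power-law hypothesis stated in the abstract, the sketch below computes perplexity from per-word log probabilities and fits a relation of the form WER = b * PP^a by ordinary least squares in log-log space. The function names, the synthetic data points, and the use of plain least squares are assumptions made for illustration only; the paper's own test rests on measurement uncertainties and is not reproduced here.

```python
import math


def perplexity(log_probs):
    """Perplexity from the natural-log probabilities a model assigns to each word."""
    return math.exp(-sum(log_probs) / len(log_probs))


def fit_power_law(perplexities, word_error_rates):
    """Fit WER = b * PP**a as a straight line in log-log space (least squares)."""
    xs = [math.log(p) for p in perplexities]
    ys = [math.log(w) for w in word_error_rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                      # exponent of the power law
    b = math.exp(mean_y - a * mean_x)  # prefactor of the power law
    return a, b


# Hypothetical, noise-free points generated from WER = 2.0 * PP**0.5.
# These are NOT measurements from the paper.
pps = [50.0, 100.0, 200.0, 400.0]
wers = [2.0 * p ** 0.5 for p in pps]
a, b = fit_power_law(pps, wers)

# A model that gives every word probability 1/4 has perplexity 4.
uniform_pp = perplexity([math.log(0.25)] * 4)
```

With real data, each (perplexity, word error rate) pair would come from one language model evaluated on the same test set, and the scatter around the fitted line would have to be compared with the measurement uncertainty before drawing any conclusion.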