This paper investigates a variety of statistical cache-based language models built upon three corpora: English, Lithuanian, and Lithuanian base forms. The impact on language-model perplexity of the cache size, the type of decay function (including custom, corpus-derived functions), and the interpolation technique (static vs. dynamic) is studied. The best results are achieved by models consisting of three components: a standard 3-gram, a decaying cache 1-gram, and a decaying cache 2-gram, joined by linear interpolation with dynamic weight updates. Such a model yielded up to 36% and 43% perplexity improvement over the 3-gram baseline for Lithuanian words and Lithuanian word base forms, respectively. The best English language model yielded up to a 16% perplexity improvement. This suggests that cache-based modeling is of greater utility for free-word-order, highly inflected languages.
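The core mechanism the abstract describes can be sketched as follows: an exponentially decaying cache assigns recent words a probability proportional to their decayed counts, and that cache estimate is linearly interpolated with a standard n-gram probability. This is a minimal illustrative sketch, not the paper's implementation; the decay rate `alpha`, cache size, interpolation weight `lam`, and the class/function names are all assumptions, and the dynamic weight-update scheme and cache 2-gram component are omitted for brevity.

```python
import math
from collections import deque

class DecayingCache:
    """Exponentially decaying cache component of a language model.

    Words seen more recently receive exponentially larger weight, so the
    cache adapts to the local topic of the text. `alpha` (decay rate) and
    `max_size` (cache length) are illustrative parameters.
    """

    def __init__(self, alpha=0.05, max_size=200):
        self.alpha = alpha
        self.history = deque(maxlen=max_size)  # oldest words fall off the left

    def add(self, word):
        """Record an observed word in the cache."""
        self.history.append(word)

    def prob(self, word):
        """P_cache(w): decayed count of w, normalised by total decayed mass.

        A word at distance d from the current position contributes
        exp(-alpha * d) to its count.
        """
        if not self.history:
            return 0.0
        total = matched = 0.0
        # Rightmost element is the most recent word (distance 1).
        for dist, w in enumerate(reversed(self.history), start=1):
            weight = math.exp(-self.alpha * dist)
            total += weight
            if w == word:
                matched += weight
        return matched / total

def interpolate(p_ngram, p_cache, lam=0.8):
    """Static linear interpolation of the n-gram and cache estimates.

    The paper's best models update the weights dynamically; here `lam`
    is a fixed illustrative weight on the n-gram component.
    """
    return lam * p_ngram + (1.0 - lam) * p_cache
```

After a word such as a topic-specific noun appears in the recent history, its interpolated probability rises above the static n-gram estimate, which is exactly the adaptation effect the perplexity improvements reflect; the dynamic-interpolation variant studied in the paper would additionally re-estimate `lam` as decoding proceeds.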