Reducing language model (LM) size is a critical issue when applying an LM in realistic applications with memory constraints. In this paper, three measures are studied for LM pruning: probability, rank, and entropy. We evaluate the three pruning criteria in a real application, Chinese text input, in terms of character error rate (CER). We first present an empirical comparison showing that rank performs best in most cases, and that its strong performance stems from its high correlation with error rate. We then present a novel method of combining two criteria in model pruning. Experimental results show that the combined criterion consistently leads to smaller models than models pruned with either criterion alone, at the same CER.
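
The abstract does not give the exact formulas for the pruning criteria or for the combined criterion, so the Python sketch below only illustrates the general idea of criterion-based n-gram pruning under stated assumptions. The toy bigram data, the thresholds, and the rule of dropping an n-gram only when both the probability and rank criteria agree are hypothetical choices for illustration, not the paper's method.

    # Minimal sketch of criterion-based n-gram pruning (assumptions, not the paper's method).
    # Probability criterion: prune bigrams whose conditional probability is very small.
    # Rank criterion: prune bigrams whose word ranks low among predictions for its history.

    def rank_of(bigrams, history, word):
        """Rank of `word` among all words predicted after `history` (1 = most probable)."""
        cands = sorted(((w, p) for (h, w), p in bigrams.items() if h == history),
                       key=lambda x: -x[1])
        for r, (w, _) in enumerate(cands, start=1):
            if w == word:
                return r
        return len(cands) + 1

    def prune(bigrams, prob_threshold=1e-3, rank_threshold=2, combine=True):
        """Drop bigrams judged unimportant; pruned entries would back off to lower-order estimates."""
        kept = {}
        for (h, w), p in bigrams.items():
            low_prob = p < prob_threshold
            low_rank = rank_of(bigrams, h, w) > rank_threshold
            # Combined criterion (assumption): prune only when both criteria agree,
            # which is more conservative than either criterion used alone.
            drop = (low_prob and low_rank) if combine else low_prob
            if not drop:
                kept[(h, w)] = p
        return kept

    if __name__ == "__main__":
        bigrams = {
            ("we", "present"): 0.20, ("we", "show"): 0.15, ("we", "evaluated"): 0.0005,
            ("language", "model"): 0.30, ("language", "models"): 0.10,
            ("language", "pruning"): 0.0002,
        }
        pruned = prune(bigrams)
        print(f"kept {len(pruned)} of {len(bigrams)} bigrams")

In a real system the pruned model would be re-evaluated on the target task (here, CER of Chinese text input) to choose thresholds that trade model size against accuracy.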