NAACL-Short '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology: companion volume of the Proceedings of HLT-NAACL 2003--short papers - Volume 2
Topic modeling in fringe word prediction for AAC
Proceedings of the 11th international conference on Intelligent user interfaces
Detection of language (model) errors
EMNLP '00 Proceedings of the 2000 Joint SIGDAT conference on Empirical methods in natural language processing and very large corpora: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 13
Language model adaptation for statistical machine translation with structured query models
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Corpus studies in word prediction
Proceedings of the 9th international ACM SIGACCESS conference on Computers and accessibility
Web resources for language modeling in conversational speech recognition
ACM Transactions on Speech and Language Processing (TSLP)
Adapting word prediction to subject matter without topic-labeled data
Proceedings of the 10th international ACM SIGACCESS conference on Computers and accessibility
Adaptive language modeling for word prediction
HLT-SRWS '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Student Research Workshop
A multiple classifier approach to detect Chinese character recognition errors
Pattern Recognition
A topic identification task for modern standard Arabic
ICCOMP'06 Proceedings of the 10th WSEAS international conference on Computers
N-gram language models are frequently used by speech recognition systems to constrain and guide the search. N-gram models use only the last N-1 words to predict the next word, with typical values of N ranging from 2 to 4. N-gram language models thus lack long-term context information. We show that the predictive power of N-gram language models can be improved by using long-term context information about the topic of discussion. We use information retrieval techniques to generalize the available context information for topic-dependent language modeling. We demonstrate the effectiveness of this technique through experiments on the Wall Street Journal text corpus, which is a relatively difficult task for topic-dependent language modeling since the text is relatively homogeneous. The proposed method reduces the perplexity of the baseline language model by 37%, indicating the predictive power of the topic-dependent language model.
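The core idea in the abstract — combining a background n-gram model with a topic-adapted model — is commonly realized by linear interpolation of the two models' probabilities. The sketch below illustrates that general technique on a bigram model with add-alpha smoothing; the toy corpora, smoothing constant, and interpolation weight are illustrative assumptions, not details from the paper.

```python
from collections import Counter

def train_bigram(sentences):
    """Count bigrams and context (preceding-word) frequencies
    from tokenized sentences."""
    bigrams, contexts = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent
        for prev, word in zip(tokens, tokens[1:]):
            bigrams[(prev, word)] += 1
            contexts[prev] += 1
    return bigrams, contexts

def prob(bigrams, contexts, prev, word, vocab_size, alpha=0.1):
    """Add-alpha smoothed bigram probability P(word | prev)."""
    return (bigrams[(prev, word)] + alpha) / (contexts[prev] + alpha * vocab_size)

def interpolated_prob(general, topic, prev, word, vocab_size, lam=0.5):
    """Linear interpolation of a background model with a
    topic-adapted model: lam * P_topic + (1 - lam) * P_general."""
    g = prob(*general, prev, word, vocab_size)
    t = prob(*topic, prev, word, vocab_size)
    return lam * t + (1 - lam) * g

# Background text vs. a small topic-specific (here: finance) corpus.
general = train_bigram([["the", "cat", "sat"], ["the", "dog", "ran"]])
topic = train_bigram([["stock", "prices", "rose"], ["stock", "prices", "fell"]])
V = 9  # size of the combined vocabulary

# A topic-relevant bigram gains probability under the interpolated model.
print(prob(*general, "stock", "prices", V))
print(interpolated_prob(general, topic, "stock", "prices", V))
```

In a full system, the topic corpus would be selected dynamically (e.g., by retrieving documents similar to the recent conversation history), and the interpolation weight would be tuned on held-out data rather than fixed at 0.5.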