An Efficiently Focusing Large Vocabulary Language Model

Authors:
Mikko Kurimo;Krista Lagus
Venue:
ICANN '02 Proceedings of the International Conference on Artificial Neural Networks
Year:
2002

Abstract

Accurate statistical language models are needed in applications such as large-vocabulary speech recognition. Constructing models that are both computationally efficient and able to exploit long-term dependencies in the data is a challenging task. In this article we describe how a topical clustering obtained from ordered maps of document collections can be used to construct efficiently focusing statistical language models. Experiments on Finnish and English texts demonstrate considerable improvements in perplexity compared with a general n-gram model and with manually classified topic categories. In the speech recognition task, the recognition history and the current hypothesis can be used to focus the model on the current discourse or topic, and the focused model can then be applied to re-rank the hypotheses.
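The focusing idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes the topical clusters are already given (two hand-made toy clusters stand in for a self-organized document map), uses add-one smoothed unigram models instead of n-grams, and picks an interpolation weight of 0.5 arbitrarily. All function and variable names are hypothetical.

```python
import math
from collections import Counter


def unigram_model(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def log_likelihood(model, tokens):
    """Sum of log probabilities of the tokens under a unigram model."""
    return sum(math.log(model[w]) for w in tokens)


def perplexity(model, tokens):
    """Per-word perplexity of the tokens under a unigram model."""
    return math.exp(-log_likelihood(model, tokens) / len(tokens))


def focus(cluster_models, general_model, history, top_k=1, lam=0.5):
    """Select the top_k clusters that best explain the history, average
    their models, and interpolate the result with the general model.
    top_k and lam are illustrative assumptions, not values from the paper."""
    ranked = sorted(
        cluster_models,
        key=lambda c: log_likelihood(cluster_models[c], history),
        reverse=True,
    )
    chosen = ranked[:top_k]
    focused = {}
    for w in general_model:
        topic_p = sum(cluster_models[c][w] for c in chosen) / len(chosen)
        focused[w] = lam * topic_p + (1.0 - lam) * general_model[w]
    return chosen, focused


# Toy clusters standing in for map units of a document map (assumption).
clusters = {
    "sports": "the team won the match and the team scored a goal".split(),
    "finance": "the bank raised the rate and the market fell".split(),
}
all_tokens = [w for toks in clusters.values() for w in toks]
vocab = sorted(set(all_tokens))

general = unigram_model(all_tokens, vocab)
cluster_models = {c: unigram_model(t, vocab) for c, t in clusters.items()}

# The recognition history focuses the model; the focused model then scores
# the upcoming words.
history = "the team scored".split()
upcoming = "the team won a match".split()
chosen, focused = focus(cluster_models, general, history)

print("selected clusters:", chosen)
print("general perplexity:", perplexity(general, upcoming))
print("focused perplexity:", perplexity(focused, upcoming))
```

In a re-ranking setting, each competing hypothesis would be scored with the focused model in the same way `upcoming` is scored here, and the hypotheses would be reordered by that score. Interpolating with the general model keeps every word's probability nonzero even when the selected cluster has little data.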