Modeling of long distance context dependency

Authors:
Zhou GuoDong
Affiliations:
Institute for Infocomm Research, Singapore
Venue:
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Year:
2004

Abstract

N-gram models are simple and have been used successfully in speech recognition and other tasks. However, they capture only the short-distance context dependency within an n-word window, and the largest practical n for a natural language is currently three, whereas much of the context dependency in natural language occurs beyond a three-word window. To incorporate this long-distance context dependency into the n-gram model of our Mandarin speech recognition system, this paper proposes a novel MI-Ngram modeling approach. The MI-Ngram model consists of two components: a normal n-gram model and a novel MI model. The n-gram model captures the short-distance context dependency within an n-word window, while the MI model uses mutual information to capture the context dependency between word pairs over a long distance. That is, the MI-Ngram model incorporates word occurrences beyond the scope of the normal n-gram model. MI-Ngram modeling is found to perform much better than normal word n-gram modeling: experiments show that an MI-Trigram model corrects about 20% of the errors made by the pure word trigram model.
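The two-component idea can be illustrated with a minimal sketch. It is not the paper's exact formulation, which the abstract does not give: here a trigram log-probability is simply added to the pointwise mutual information of word pairs that lie beyond the trigram window. The names `train`, `pmi`, `trigram_logprob` and `mi_ngram_score`, the `MAX_DIST` cutoff, the add-one smoothing, the zero fallback for unseen pairs, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter

N = 3          # order of the short-distance model (trigram)
MAX_DIST = 6   # assumed cutoff: how far back the MI component looks


def train(sentences):
    """Count trigrams and long-distance word pairs in tokenized sentences."""
    tri, ctx = Counter(), Counter()
    pair, left, right = Counter(), Counter(), Counter()
    vocab, total = set(), 0
    for sent in sentences:
        vocab.update(sent)
        padded = ["<s>"] * (N - 1) + sent
        for i in range(N - 1, len(padded)):
            h = tuple(padded[i - N + 1:i])
            tri[h + (padded[i],)] += 1
            ctx[h] += 1
        # Pairs whose distance d satisfies N <= d <= MAX_DIST,
        # i.e. pairs the trigram window cannot see.
        for j, w in enumerate(sent):
            for i in range(max(0, j - MAX_DIST), max(0, j - N + 1)):
                pair[(sent[i], w)] += 1
                left[sent[i]] += 1
                right[w] += 1
                total += 1
    return {"tri": tri, "ctx": ctx, "pair": pair, "left": left,
            "right": right, "vocab": vocab, "total": total}


def trigram_logprob(model, history, word):
    """Add-one smoothed trigram log-probability (smoothing is an assumption)."""
    h = tuple((["<s>"] * (N - 1) + history)[-(N - 1):])
    v = len(model["vocab"])
    return math.log((model["tri"][h + (word,)] + 1) / (model["ctx"][h] + v))


def pmi(model, a, b):
    """Pointwise mutual information of a long-distance pair; 0.0 if unseen."""
    c = model["pair"][(a, b)]
    if c == 0:
        return 0.0
    return math.log(c * model["total"] / (model["left"][a] * model["right"][b]))


def mi_ngram_score(model, history, word):
    """Trigram log-probability plus PMI with words beyond the trigram window.

    This is an unnormalized score, not a probability.
    """
    j = len(history)
    far = history[max(0, j - MAX_DIST):max(0, j - N + 1)]
    return trigram_logprob(model, history, word) + sum(
        pmi(model, a, word) for a in far)


# Toy corpus (invented): the first word predicts the last one,
# but it is four positions away, outside the trigram window.
corpus = [
    ["bank", "said", "the", "new", "loan"],
    ["bank", "said", "the", "old", "loan"],
    ["river", "said", "the", "new", "boat"],
    ["river", "said", "the", "old", "boat"],
]
model = train(corpus)
history = ["bank", "said", "the", "new"]
for candidate in ("loan", "boat"):
    print(candidate,
          round(trigram_logprob(model, history, candidate), 3),
          round(mi_ngram_score(model, history, candidate), 3))
```

In this toy setting the trigram component scores "loan" and "boat" identically after "the new", because both follow that context equally often. Only the long-distance pair ("bank", "loan") separates them, which is the kind of dependency the MI component is meant to capture.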