This article investigates how different degrees of contextual granularity affect language model performance. It presents a new language model that combines clustering with half-contextualization, a novel representation of contexts. Half-contextualization rests on the half-context hypothesis: the distributional characteristics of a word or bigram are best represented by treating its left and right context distributions separately, and only directionally relevant distributional information should be used. Clustering is performed with a new clustering algorithm for class-based language models that compares favorably to the exchange algorithm. When interpolated with a Kneser-Ney model, half-context models achieve lower perplexity than commonly used interpolated n-gram models and traditional class-based approaches. A novel, fine-grained, context-specific analysis highlights the contexts in which the model performs well and those that are better handled by existing non-class-based models.
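
As a concrete illustration of the half-context hypothesis, the sketch below collects separate left-neighbor and right-neighbor distributions for each word of a toy corpus, rather than merging both directions into a single context profile. This is only a minimal rendering of the idea, not the paper's algorithm; the function name, the toy corpus, and the bigram-sized context window are all hypothetical choices made for the example.

    # Illustrative sketch only: keep a word's LEFT and RIGHT context
    # distributions separate instead of pooling them. The window size (one
    # neighbor) and all names here are hypothetical, not from the paper.
    from collections import Counter, defaultdict

    def half_context_profiles(sentences):
        """Return per-word left and right neighbor distributions, kept apart."""
        left = defaultdict(Counter)   # left[w][v]  = times v occurs just before w
        right = defaultdict(Counter)  # right[w][v] = times v occurs just after w
        for tokens in sentences:
            padded = ["<s>"] + tokens + ["</s>"]
            for prev, cur in zip(padded, padded[1:]):
                left[cur][prev] += 1
                right[prev][cur] += 1
        return left, right

    corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
    left, right = half_context_profiles(corpus)
    print(left["sat"])   # Counter({'cat': 1, 'dog': 1}) -- left half-context of "sat"
    print(right["the"])  # Counter({'cat': 1, 'dog': 1}) -- right half-context of "the"

A clustering algorithm in the spirit of the paper would then group words whose left profiles are similar and, separately, words whose right profiles are similar, so that only directionally relevant information enters each comparison.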
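
The abstract also states that the half-context model is interpolated with a Kneser-Ney model but gives no interpolation scheme or weights. Assuming standard linear interpolation, the combination is P(w|h) = lam * P_KN(w|h) + (1 - lam) * P_class(w|h); a minimal sketch with a hypothetical fixed weight:

    def interpolated_prob(w, history, p_kn, p_class, lam=0.7):
        """Linear interpolation of two conditional models.

        p_kn and p_class are any callables returning P(w | history);
        lam = 0.7 is an arbitrary placeholder, not a value from the paper.
        """
        return lam * p_kn(w, history) + (1.0 - lam) * p_class(w, history)

    # Toy usage with dummy component models (uniform over a 3-word vocabulary):
    uniform = lambda w, h: 1.0 / 3
    print(interpolated_prob("cat", ("the",), uniform, uniform))  # 0.333...

In practice the weight would be tuned on held-out data; the point of the sketch is only that the class-based half-context model supplies one term of a standard interpolated mixture.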