Statistical language models based on the n-gram approach have been criticized for neglecting the large-span syntactic-semantic information that influences the choice of the next word in a language. One approach that has recently helped is latent semantic analysis (LSA), which captures the semantic fabric of a document to enhance the n-gram model; syntactic analysis has similarly been used to enhance n-gram models. In this paper, we describe a framework called syntactically enhanced latent semantic analysis and its application to statistical language modeling. The approach augments each word with a syntactic descriptor: its part-of-speech tag, phrase type, or supertag. We observe that, given this syntactic knowledge, the model significantly outperforms LSA-based models in terms of perplexity. We also present observations on the effect of knowing whether a word is a content or function word. Finally, the paper poses the problem of better syntax prediction as the path to achieving these benchmarks.
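The core idea described above, augmenting each word with a syntactic descriptor before building the latent semantic space, can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the toy corpus, the POS tag set, and the latent dimension k are invented for the example, and a plain word_tag-by-document SVD stands in for the full framework.

```python
# Sketch of syntactically enhanced LSA: each word is paired with a
# syntactic descriptor (here a POS tag), and a word_tag-by-document
# count matrix is factored by SVD into a latent semantic space.
import numpy as np

# Toy tagged corpus (illustrative, not from the paper): each document
# is a list of (word, pos_tag) pairs.
docs = [
    [("the", "DT"), ("bank", "NN"), ("approved", "VB"), ("the", "DT"), ("loan", "NN")],
    [("the", "DT"), ("river", "NN"), ("bank", "NN"), ("flooded", "VB")],
    [("banks", "NN"), ("loan", "VB"), ("money", "NN")],
]

# Vocabulary of word_tag units. This is the key idea: "loan_NN" and
# "loan_VB" become separate rows, unlike in plain word-based LSA.
units = sorted({f"{w}_{t}" for d in docs for (w, t) in d})
index = {u: i for i, u in enumerate(units)}

# Build the word_tag-by-document count matrix.
M = np.zeros((len(units), len(docs)))
for j, d in enumerate(docs):
    for w, t in d:
        M[index[f"{w}_{t}"], j] += 1

# Rank-k truncated SVD gives latent vectors for the tagged units.
k = 2
U, S, Vt = np.linalg.svd(M, full_matrices=False)
unit_vecs = U[:, :k] * S[:k]  # one k-dimensional vector per word_tag unit

def similarity(a, b):
    """Cosine similarity between two word_tag units in the latent space."""
    va, vb = unit_vecs[index[a]], unit_vecs[index[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb) + 1e-12))
```

In a language model, such latent vectors would then be combined with n-gram probabilities (for example by interpolation), so that the long-span semantic context and the local syntactic identity of each word both inform the prediction of the next word.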