Recent contributions to statistical language modeling for speech recognition have shown that probabilistically parsing a partial word sequence aids the prediction of the next word, leading to "structured" language models that have the potential to outperform n-grams. Existing approaches to structured language modeling construct nodes in the partial parse tree after all of the underlying words have been predicted. This paper presents a different approach, based on probabilistic left-corner grammar (PLCG) parsing, that extends a partial parse both from the bottom up and from the top down, leading to a more focused and more accurate, though somewhat less robust, search of the parse space. At the core of our new structured language model is a fast context-sensitive and lexicalized PLCG parsing algorithm that uses dynamic programming. Preliminary perplexity and word-accuracy results appear to be competitive with previous ones, while speed is increased.