Head-driven statistical models for natural language parsing
Head-driven statistical models for natural language parsing
Building a large annotated corpus of English: the penn treebank
Computational Linguistics - Special issue on using large corpora: II
Probabilistic top-down parsing and language modeling
Computational Linguistics
PCFG models of linguistic tree representations
Computational Linguistics
A maximum-entropy-inspired parser
NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
Three generative, lexicalised models for statistical parsing
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
An empirical study of smoothing techniques for language modeling
ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
Immediate-head parsing for language models
ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
A study on richer syntactic dependencies for structured language modeling
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Accurate unlexicalized parsing
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Best-first word-lattice parsing: techniques for integrated syntactic language modeling
Best-first word-lattice parsing: techniques for integrated syntactic language modeling
Coarse-to-fine n-best parsing and MaxEnt discriminative reranking
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
A hierarchical phrase-based model for statistical machine translation
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Dependency treelet translation: syntactically informed phrasal SMT
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Learning accurate, compact, and interpretable tree annotation
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Scalable inference and training of context-rich syntactic translation models
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Adapting a WSJ-trained parser to grammatically noisy text
HLT-Short '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers
Moses: open source toolkit for statistical machine translation
ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
Joshua: an open source toolkit for parsing-based machine translation
StatMT '09 Proceedings of the Fourth Workshop on Statistical Machine Translation
A large scale distributed syntactic, semantic and lexical language model for machine translation
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Faster and smaller N-gram language models
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Rule Markov models for fast tree-to-string translation
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Judging grammaticality with tree substitution grammar derivations
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
KenLM: faster and smaller language model queries
WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation
Hi-index | 0.00 |
We propose a simple generative, syntactic language model that conditions on overlapping windows of tree context (or treelets) in the same way that n-gram language models condition on overlapping windows of linear context. We estimate the parameters of our model by collecting counts from automatically parsed text using standard n-gram language model estimation techniques, allowing us to train a model on over one billion tokens of data using a single machine in a matter of hours. We evaluate on perplexity and a range of grammaticality tasks, and find that we perform as well or better than n-gram models and other generative baselines. Our model even competes with state-of-the-art discriminative models hand-designed for the grammaticality tasks, despite training on positive data alone. We also show fluency improvements in a preliminary machine translation experiment.