We present a language model in which the probability of a sentence is the sum of the individual parse probabilities, and these are calculated using a probabilistic context-free grammar (PCFG) plus statistics on individual words and how they fit into parses. We have used the model to improve syntactic disambiguation. After training on Wall Street Journal (WSJ) text, we tested on about 200 WSJ sentences restricted to the 5400 most common words from our training data. We observed a 41% [...] performance of our PCFG without the use of the word statistics.
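
The central quantity, a sentence probability obtained by summing over its parses, can be illustrated with a minimal sketch. It assumes a plain PCFG only and leaves out the word statistics, which the abstract does not specify. The grammar, the rule probabilities, the example sentence, and the names `RULE_PROB`, `parse_probability`, and `sentence_probability` are all hypothetical and are not taken from the paper.

```python
# Hypothetical toy rule probabilities P(rhs | lhs); not from the paper.
# For each left-hand side the probabilities sum to 1.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("she",)): 0.3,
    ("NP", ("fish",)): 0.3,
    ("NP", ("forks",)): 0.2,
    ("NP", ("NP", "PP")): 0.2,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("VP", "PP")): 0.3,
    ("V", ("eats",)): 1.0,
    ("PP", ("P", "NP")): 1.0,
    ("P", ("with",)): 1.0,
}


def parse_probability(tree):
    """Product of the probabilities of every rule used in the tree.

    A tree is a tuple (label, child, ...); a child is either a word
    (str) or another tree.
    """
    label, *children = tree
    # The right-hand side is the sequence of child labels or words.
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    prob = RULE_PROB[(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            prob *= parse_probability(child)
    return prob


def sentence_probability(parses):
    """Sum of the probabilities of all parses of one sentence."""
    return sum(parse_probability(tree) for tree in parses)


# Two parses of "she eats fish with forks" (a PP-attachment ambiguity).
PP_ON_VERB = (
    "S", ("NP", "she"),
    ("VP",
     ("VP", ("V", "eats"), ("NP", "fish")),
     ("PP", ("P", "with"), ("NP", "forks"))),
)
PP_ON_NOUN = (
    "S", ("NP", "she"),
    ("VP",
     ("V", "eats"),
     ("NP", ("NP", "fish"), ("PP", ("P", "with"), ("NP", "forks")))),
)
PARSES = [PP_ON_VERB, PP_ON_NOUN]

if __name__ == "__main__":
    for tree in PARSES:
        print(parse_probability(tree))
    print(sentence_probability(PARSES))
```

In this sketch, disambiguation amounts to picking the parse with the larger probability. A model of the kind described in the abstract would additionally condition these probabilities on statistics about the individual words.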