Parsing with Context-Free Grammars and Word Statistics

Authors:
Eugene Charniak
Affiliations:
-
Venue:
Parsing with Context-Free Grammars and Word Statistics
Year:
1995

Abstract

We present a language model in which the probability of a sentence is the sum of the individual parse probabilities, and these are calculated using a probabilistic context-free grammar (PCFG) plus statistics on individual words and how they fit into parses. We have used the model to improve syntactic disambiguation. After training on Wall Street Journal (WSJ) text, we tested on about 200 WSJ sentences restricted to the 5400 most common words from our training data. We observed a 41% [...] performance of our PCFG without the use of the word statistics.
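
The central quantity in the abstract, a sentence probability obtained by summing over the probabilities of its parses, can be illustrated with a minimal sketch. The grammar, the rule probabilities, the example sentence, and the function name chart_score below are all invented for illustration and are not taken from the paper. The sketch covers only the plain PCFG part and does not model the word statistics the paper adds. It fills a CKY-style chart over a grammar in Chomsky normal form. Combining alternative analyses of a span by addition gives the sentence probability, while combining them by max gives the probability of the single best parse, which is what disambiguation would select.

```python
import operator
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form; every probability is invented.
BINARY = {  # (parent, left child, right child) -> rule probability
    ("S", "NP", "VP"): 1.0,
    ("VP", "V", "NP"): 0.7,
    ("VP", "VP", "PP"): 0.3,
    ("NP", "NP", "PP"): 0.2,
    ("PP", "P", "NP"): 1.0,
}
LEXICAL = {  # (tag, word) -> rule probability
    ("NP", "she"): 0.3,
    ("NP", "fish"): 0.3,
    ("NP", "forks"): 0.2,
    ("V", "eats"): 1.0,
    ("P", "with"): 1.0,
}

SENTENCE = "she eats fish with forks".split()


def chart_score(words, combine):
    """Fill a CKY chart and return the score of S over the whole sentence.

    combine=operator.add sums over all parses (sentence probability);
    combine=max keeps only the most probable parse.
    """
    n = len(words)
    chart = defaultdict(float)  # (start, end, label) -> score, 0.0 if absent

    # Width-1 spans come from the lexical rules.
    for i, word in enumerate(words):
        for (tag, rule_word), prob in LEXICAL.items():
            if rule_word == word:
                chart[i, i + 1, tag] = prob

    # Wider spans are built from two adjacent, already completed spans.
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for split in range(start + 1, end):
                for (parent, left, right), prob in BINARY.items():
                    score = prob * chart[start, split, left] * chart[split, end, right]
                    if score > 0.0:
                        key = (start, end, parent)
                        chart[key] = combine(chart[key], score)

    return chart[0, n, "S"]


sentence_probability = chart_score(SENTENCE, operator.add)
best_parse_probability = chart_score(SENTENCE, max)
print(sentence_probability, best_parse_probability)
```

Under these invented numbers the sentence has two parses, one attaching "with forks" to the verb phrase (probability about 0.00378) and one attaching it to "fish" (about 0.00252). The summed value is therefore about 0.0063, and the max-based value picks out the verb-phrase attachment.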