In this paper, we compare three different approaches to building a probabilistic context-free grammar (PCFG) for natural language parsing from a treebank corpus: 1) a model that simply extracts the rules contained in the corpus and counts the number of occurrences of each rule; 2) a model that additionally stores the category of each node's parent; and 3) a model that estimates the probabilities according to a generalized k-gram scheme with k = 3. The last model allows faster parsing and decreases the perplexity of test samples.
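As an illustration of the first two schemes, here is a minimal sketch in Python. It assumes bracketed Penn-Treebank-style trees; the toy corpus and all identifiers (parse_tree, productions, estimate_pcfg) are hypothetical and not taken from the paper. Scheme 1 is plain relative-frequency (maximum-likelihood) estimation, P(A -> beta) = count(A -> beta) / count(A); scheme 2 follows the common parent-annotation convention of suffixing each nonterminal with its parent's category. The generalized k-gram scheme (3) depends on details of the paper and is not sketched here.

```python
from collections import Counter

def parse_tree(s):
    """Parse a bracketed tree like '(S (NP (DT the)) (VP (VBD ran)))'
    into a nested (label, children) tuple; leaves are plain strings."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    def read(pos):
        assert tokens[pos] == "("
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                child, pos = read(pos)
            else:
                child, pos = tokens[pos], pos + 1
            children.append(child)
        return (label, children), pos + 1
    tree, _ = read(0)
    return tree

def productions(tree, parent=None, annotate=False):
    """Yield the CFG rules (lhs, rhs) observed in a tree. With
    annotate=True, each nonterminal is suffixed with its parent's
    category (scheme 2); terminals are left unchanged."""
    label, children = tree
    lhs = f"{label}^{parent}" if annotate and parent else label
    rhs = []
    for child in children:
        if isinstance(child, str):            # terminal leaf
            rhs.append(child)
        else:
            clabel = child[0]
            rhs.append(f"{clabel}^{label}" if annotate else clabel)
            yield from productions(child, parent=label, annotate=annotate)
    yield (lhs, tuple(rhs))

def estimate_pcfg(trees, annotate=False):
    """Relative-frequency rule probabilities:
    P(A -> beta) = count(A -> beta) / count(A)."""
    rule_counts, lhs_counts = Counter(), Counter()
    for t in trees:
        for lhs, rhs in productions(t, annotate=annotate):
            rule_counts[(lhs, rhs)] += 1
            lhs_counts[lhs] += 1
    return {rule: c / lhs_counts[rule[0]] for rule, c in rule_counts.items()}

# Toy treebank (hypothetical examples, not from the paper).
corpus = [parse_tree("(S (NP (DT the) (NN dog)) (VP (VBD ran)))"),
          parse_tree("(S (NP (DT the) (NN cat)) (VP (VBD slept)))")]

plain = estimate_pcfg(corpus)                    # scheme 1: plain rule counts
parented = estimate_pcfg(corpus, annotate=True)  # scheme 2: parent annotation
```

With parent annotation, a rule such as NP -> DT NN becomes NP^S -> DT^NP NN^NP, so the same underlying rule can receive different probabilities depending on the context in which the NP appears; this is what "storing the category of the parent node" buys over the plain counts of scheme 1.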