A comparison of PCFG models

  • Authors:
  • Jose Luis Verdú-Mas;Jorge Calera-Rubio;Rafael C. Carrasco

  • Affiliations:
  • Universitat d'Alacant, Alacant, Spain;Universitat d'Alacant, Alacant, Spain;Universitat d'Alacant, Alacant, Spain

  • Venue:
  • CoNLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7
  • Year:
  • 2000

Abstract

In this paper, we compare three approaches to building a probabilistic context-free grammar (PCFG) for natural language parsing from a treebank corpus: 1) a model that simply extracts the rules contained in the corpus and counts the number of occurrences of each rule; 2) a model that also stores information about the parent node's category; and 3) a model that estimates the probabilities according to a generalized k-gram scheme with k = 3. The last model allows faster parsing and decreases the perplexity of test samples.
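
The first two models can be illustrated with a minimal sketch. It is not the authors' implementation: the tuple encoding of trees, the `^` separator for parent annotation, the toy two-sentence treebank, and the function names `extract_rules` and `estimate_pcfg` are all assumptions made for illustration. Rule probabilities are plain relative frequencies, with no smoothing. The generalized k-gram model is not sketched here, because the abstract does not define it.

```python
from collections import Counter


def extract_rules(tree, parent=None, annotate_parent=False):
    """Yield (lhs, rhs) rules from a tree.

    A tree is a tuple (label, child, ...); a leaf (word) is a plain string.
    With annotate_parent=True, every nonterminal is written as
    "label^parent_label" (hypothetical notation), except the root.
    """
    label, *children = tree
    lhs = f"{label}^{parent}" if annotate_parent and parent is not None else label
    rhs = []
    for child in children:
        if isinstance(child, str):
            rhs.append(child)  # terminal symbol: never annotated
        else:
            child_label = child[0]
            rhs.append(f"{child_label}^{label}" if annotate_parent else child_label)
    yield lhs, tuple(rhs)
    # Recurse into nonterminal children, passing the current label as parent.
    for child in children:
        if not isinstance(child, str):
            yield from extract_rules(child, label, annotate_parent)


def estimate_pcfg(treebank, annotate_parent=False):
    """Estimate rule probabilities as count(lhs -> rhs) / count(lhs)."""
    rule_counts = Counter()
    for tree in treebank:
        rule_counts.update(extract_rules(tree, annotate_parent=annotate_parent))
    lhs_totals = Counter()
    for (lhs, _), n in rule_counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in rule_counts.items()}


# Toy treebank (invented for this example, not taken from the paper).
treebank = [
    ("S", ("NP", "she"), ("VP", ("V", "saw"), ("NP", "stars"))),
    ("S", ("NP", "stars"), ("VP", ("V", "shine"))),
]

plain = estimate_pcfg(treebank)
annotated = estimate_pcfg(treebank, annotate_parent=True)
```

In the plain model, all `NP` expansions share one distribution. With parent annotation, `NP^S` (subject position) and `NP^VP` (object position) are estimated separately, which is the extra context the second model stores.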