In this paper, we compare three different approaches to building a probabilistic context-free grammar (PCFG) for natural language parsing from a treebank corpus: 1) a model that simply extracts the rules contained in the corpus and counts the number of occurrences of each rule; 2) a model that additionally stores the category of each node's parent; and 3) a model that estimates the probabilities according to a generalized k-gram scheme with k = 3. The last model allows faster parsing and decreases the perplexity of test samples.
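As an illustration of the first two schemes, here is a minimal sketch in Python. It assumes bracketed Penn-Treebank-style trees; the toy corpus and all identifiers (parse_tree, productions, estimate_pcfg) are hypothetical and not taken from the paper. Scheme 1 is plain relative-frequency (maximum-likelihood) estimation, P(A -> beta) = count(A -> beta) / count(A); scheme 2 follows the common parent-annotation convention of suffixing each nonterminal with its parent's category. The generalized k-gram scheme (3) depends on details of the paper and is not sketched here.

```python
from collections import Counter

def parse_tree(s):
    """Parse a bracketed tree like '(S (NP (DT the)) (VP (VBD ran)))'
    into a nested (label, children) tuple; leaves are plain strings."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    def read(pos):
        assert tokens[pos] == "("
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                child, pos = read(pos)
            else:
                child, pos = tokens[pos], pos + 1
            children.append(child)
        return (label, children), pos + 1
    tree, _ = read(0)
    return tree

def productions(tree, parent=None, annotate=False):
    """Yield the CFG rules (lhs, rhs) observed in a tree. With
    annotate=True, each nonterminal is suffixed with its parent's
    category (scheme 2); terminals are left unchanged."""
    label, children = tree
    lhs = f"{label}^{parent}" if annotate and parent else label
    rhs = []
    for child in children:
        if isinstance(child, str):            # terminal leaf
            rhs.append(child)
        else:
            clabel = child[0]
            rhs.append(f"{clabel}^{label}" if annotate else clabel)
            yield from productions(child, parent=label, annotate=annotate)
    yield (lhs, tuple(rhs))

def estimate_pcfg(trees, annotate=False):
    """Relative-frequency rule probabilities:
    P(A -> beta) = count(A -> beta) / count(A)."""
    rule_counts, lhs_counts = Counter(), Counter()
    for t in trees:
        for lhs, rhs in productions(t, annotate=annotate):
            rule_counts[(lhs, rhs)] += 1
            lhs_counts[lhs] += 1
    return {rule: c / lhs_counts[rule[0]] for rule, c in rule_counts.items()}

# Toy treebank (hypothetical examples, not from the paper).
corpus = [parse_tree("(S (NP (DT the) (NN dog)) (VP (VBD ran)))"),
          parse_tree("(S (NP (DT the) (NN cat)) (VP (VBD slept)))")]

plain = estimate_pcfg(corpus)                    # scheme 1: plain rule counts
parented = estimate_pcfg(corpus, annotate=True)  # scheme 2: parent annotation
```

With parent annotation, a rule such as NP -> DT NN becomes NP^S -> DT^NP NN^NP, so the same underlying rule can receive different probabilities depending on the context in which the NP appears; this is what "storing the category of the parent node" buys over the plain counts of scheme 1.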