In this paper, we compare three approaches to building a probabilistic context-free grammar for natural language parsing from a treebank corpus: (1) a model that simply extracts the rules contained in the corpus and counts the occurrences of each rule; (2) a model that also stores information about the parent node's category; and (3) a model that estimates the probabilities according to a generalized k-gram scheme for trees with k = 3. The last model allows for faster parsing and considerably decreases the perplexity of test samples.
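The first two models can be sketched briefly. The following Python fragment is a minimal illustration, not the authors' implementation: it assumes parse trees represented as nested tuples `(label, child, ...)` with string leaves, extracts rules with occurrence counts (model 1), optionally annotates each nonterminal with its parent's category (model 2), and turns counts into relative-frequency rule probabilities.

```python
from collections import Counter

def extract_rules(tree, parent=None, annotate_parent=False, counts=None):
    """Count CFG rules in a tree given as (label, child, ...) tuples with
    string leaves. With annotate_parent=True, every nonterminal label is
    extended with its parent's category, as in the parent-annotated model.
    (Illustrative sketch; tree encoding and naming are assumptions.)"""
    if counts is None:
        counts = Counter()
    label, children = tree[0], tree[1:]
    lhs = f"{label}^{parent}" if annotate_parent and parent else label
    rhs = tuple(
        (f"{c[0]}^{label}" if annotate_parent else c[0])
        if isinstance(c, tuple) else c
        for c in children
    )
    counts[(lhs, rhs)] += 1
    for c in children:
        if isinstance(c, tuple):
            extract_rules(c, parent=label,
                          annotate_parent=annotate_parent, counts=counts)
    return counts

def rule_probabilities(counts):
    """Maximum-likelihood estimate: the count of each rule divided by the
    total count of rules sharing the same left-hand side."""
    totals = Counter()
    for (lhs, _), n in counts.items():
        totals[lhs] += n
    return {rule: n / totals[rule[0]] for rule, n in counts.items()}

tree = ("S", ("NP", "she"), ("VP", ("V", "saw"), ("NP", "stars")))
probs = rule_probabilities(extract_rules(tree))
```

With `annotate_parent=True`, the same tree yields rules such as `NP^S -> she`, so the two `NP` expansions are conditioned on their contexts instead of being pooled, which is what distinguishes model 2 from model 1.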