Weighted and probabilistic context-free grammars are equally expressive

Authors:
Noah A. Smith;Mark Johnson
Affiliations:
-;-
Venue:
Computational Linguistics
Year:
2007

Cited by (6):

A noisy-channel model of rational human sentence comprehension under uncertain input
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing

Sparse multi-scale grammars for discriminative latent variable parsing
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing

Painless unsupervised learning with features
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Viterbi training for PCFGs: hardness results and competitiveness of uniform initialization
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

Integrating surprisal and uncertain-input models in online sentence comprehension: formal techniques and empirical results
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1

The generative power of probabilistic and weighted context-free grammars
MOL'11 Proceedings of the 12th biennial conference on The mathematics of language

Abstract

This article studies the relationship between weighted context-free grammars (WCFGs), where each production is associated with a positive real-valued weight, and probabilistic context-free grammars (PCFGs), where the weights of the productions associated with a nonterminal are constrained to sum to one. Because the class of WCFGs properly includes the PCFGs, one might expect that WCFGs can describe distributions that PCFGs cannot. However, Z. Chi (1999, Computational Linguistics, 25(1):131--160) and S. P. Abney, D. A. McAllester, and F. Pereira (1999, In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, pages 542--549, College Park, MD) proved that every WCFG distribution is equivalent to some PCFG distribution. We extend their results to conditional distributions, and show that every WCFG conditional distribution of parses given strings is also the conditional distribution defined by some PCFG, even when the WCFG's partition function diverges. This shows that any parsing or labeling accuracy improvement from conditional estimation of WCFGs or conditional random fields (CRFs) over joint estimation of PCFGs or hidden Markov models (HMMs) is due to the estimation procedure rather than the change in model class, because PCFGs and HMMs are exactly as expressive as WCFGs and chain-structured CRFs, respectively.
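
To make the WCFG-to-PCFG equivalence concrete, the following is a minimal Python sketch of one standard renormalization for the easy case where the partition function converges. It is an illustration under stated assumptions, not necessarily the construction used in the article, and it does not cover the divergent case the abstract mentions. The grammar encoding, the toy grammar, and the function names `partition_function` and `to_pcfg` are all hypothetical. Each rule A -> alpha with weight w is given probability w times the product of Z_B over the nonterminals B in alpha, divided by Z_A, where Z_A is the total weight of all trees rooted in A.

```python
from math import prod

# Hypothetical encoding: nonterminal -> list of (weight, right-hand side).
# Any symbol that is not a key of the dict is treated as a terminal.
wcfg = {"S": [(0.5, ("S", "S")), (0.25, ("a",))]}

def partition_function(grammar, iterations=200):
    """Approximate Z_A for every nonterminal by fixed-point iteration from 0.

    Assumes the partition function converges; terminals contribute a factor 1.
    """
    z = {nt: 0.0 for nt in grammar}
    for _ in range(iterations):
        z = {
            nt: sum(w * prod(z.get(sym, 1.0) for sym in rhs) for w, rhs in rules)
            for nt, rules in grammar.items()
        }
    return z

def to_pcfg(grammar, z):
    """Rescale each rule weight so the rules of every nonterminal sum to one."""
    return {
        nt: [(w * prod(z.get(sym, 1.0) for sym in rhs) / z[nt], rhs)
             for w, rhs in rules]
        for nt, rules in grammar.items()
    }

z = partition_function(wcfg)
pcfg = to_pcfg(wcfg, z)

# Tree (S (S a) (S a)): one use of S -> S S and two uses of S -> a.
tree_weight = 0.5 * 0.25 * 0.25
p_binary, p_leaf = pcfg["S"][0][0], pcfg["S"][1][0]
tree_prob = p_binary * p_leaf * p_leaf
```

In this toy grammar the rescaled probability of a tree equals its WCFG weight divided by Z_S, so the two grammars define the same distribution over trees once the WCFG weights are normalized.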