Foundations of Statistical Natural Language Processing
Probabilistic top-down parsing and language modeling
Computational Linguistics
Estimation of probabilistic context-free grammars
Computational Linguistics
Statistical properties of probabilistic context-free grammars
Computational Linguistics
Exploiting syntactic structure for language modeling
COLING '98: Proceedings of the 17th International Conference on Computational Linguistics, Volume 1
Stochastic lexicalized tree-adjoining grammars
COLING '92: Proceedings of the 14th Conference on Computational Linguistics, Volume 2
Immediate-head parsing for language models
ACL '01: Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics
Head-Driven Statistical Models for Natural Language Parsing
Computational Linguistics
A General Technique to Train Language Models on Language Models
Computational Linguistics
Introduction to Automata Theory, Languages, and Computation (3rd Edition)
COLING '04: Proceedings of the 20th International Conference on Computational Linguistics
Estimation of consistent probabilistic context-free grammars
HLT-NAACL '06: Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics
Learning and inference for hierarchically split PCFGs
AAAI '07: Proceedings of the 22nd National Conference on Artificial Intelligence, Volume 2
Maximum likelihood analysis of algorithms and data structures
Theoretical Computer Science
We investigate the problem of training probabilistic context-free grammars on the basis of a distribution defined over an infinite set of trees, by minimizing the cross-entropy between that distribution and the grammar. This problem can be seen as a generalization of the well-known maximum likelihood estimator on (finite) treebanks. We prove an unexpected theoretical property of grammars trained in this way: the derivational entropy of the grammar takes the same value as the cross-entropy between the input distribution and the grammar itself. We show that the result also holds for the widely applied maximum likelihood estimator on treebanks.
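As a minimal sketch of the identity stated above, using notation of our own choosing rather than the paper's: write p for the input distribution over trees and p_G for the distribution that the trained grammar G induces over its derivation trees. The training objective is the cross-entropy

\[
H(p, p_G) = -\sum_{t} p(t) \log p_G(t),
\]

where t ranges over the (possibly infinite) set of trees, while the derivational entropy of G is

\[
H_d(G) = -\sum_{t} p_G(t) \log p_G(t).
\]

The claimed result is that, for the cross-entropy-minimizing grammar, these two quantities coincide: H(p, p_G) = H_d(G). The treebank case is recovered by taking p to be the empirical distribution of a finite treebank, for which the minimizer is the familiar relative-frequency estimate

\[
p_G(A \to \alpha) = \frac{f(A \to \alpha)}{\sum_{\beta} f(A \to \beta)},
\]

with f counting rule occurrences in the treebank.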