A General Technique to Train Language Models on Language Models

  • Authors:
  • Mark-Jan Nederhof

  • Affiliations:
  • -

  • Venue:
  • Computational Linguistics
  • Year:
  • 2005

Abstract

We show that under certain conditions, a language model can be trained on the basis of a second language model. The main instance of the technique trains a finite automaton on the basis of a probabilistic context-free grammar, such that the Kullback-Leibler distance between grammar and trained automaton is provably minimal. This is a substantial generalization of an existing algorithm to train an n-gram model on the basis of a probabilistic context-free grammar.
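Illustrative Sketch

The abstract describes estimating the probabilities of a finite automaton so that the Kullback-Leibler distance from a probabilistic context-free grammar is minimal. The paper obtains the required expected transition counts exactly, via an intersection-style construction; the sketch below is only a rough illustration of the idea, under assumed simplifications: it approximates the expected counts by enumerating bounded-length sentences of a toy grammar, and uses a bigram automaton whose states are the previous symbols. The grammar, the length cutoff, and all names are illustrative, not the paper's construction.

```python
# Approximate sketch: estimate bigram-automaton probabilities from a toy
# "PCFG" by enumerating sentences of bounded length.  The paper computes
# exact expected counts instead; the cutoff below loses a small amount of
# probability mass, so this is only an approximation.

from collections import defaultdict

# Toy grammar over {a, b}: the sentence a^n b^n has probability 0.5^n
# (e.g. S -> a S b | a b, each rule with probability 0.5).
def pcfg_sentences(max_n=12):
    for n in range(1, max_n + 1):
        yield ["a"] * n + ["b"] * n, 0.5 ** n

START, STOP = "<s>", "</s>"

# Accumulate expected transition counts of the bigram automaton,
# weighting each transition by the probability of the sentence.
expected = defaultdict(float)
for sentence, p in pcfg_sentences():
    prev = START
    for sym in sentence + [STOP]:
        expected[(prev, sym)] += p
        prev = sym

# Relative-frequency estimation from the expected counts: normalize the
# outgoing transitions of each state.  For a fixed automaton structure,
# this choice of probabilities minimizes the Kullback-Leibler distance
# from the grammar to the automaton.
totals = defaultdict(float)
for (q, sym), c in expected.items():
    totals[q] += c
bigram = {(q, sym): c / totals[q] for (q, sym), c in expected.items()}

for (q, sym), prob in sorted(bigram.items()):
    print(f"P({sym} | {q}) = {prob:.4f}")
```

With the cutoff above, the estimates approach P(a | a) = P(b | a) = 0.5, P(b | b) = P(</s> | b) = 0.5, and P(a | <s>) = 1, which is the relative-frequency solution one would also obtain from exact expected counts.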