Refining generative language models using discriminative learning

Authors:
Ben Sandbank
Affiliations:
Tel-Aviv University, Tel-Aviv, Israel
Venue:
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Year:
2008

Abstract

We propose a new approach to language modeling that uses discriminative learning methods. The approach is iterative: starting with an initial language model, in each iteration we generate "false" sentences from the current model and then train a classifier to discriminate between them and sentences from the training corpus. To the extent that this succeeds, the classifier is incorporated into the model by lowering the probability of sentences classified as false, and the process is repeated. We demonstrate the effectiveness of this approach on a natural language corpus and show that it provides an 11.4% improvement in perplexity over a modified Kneser-Ney smoothed trigram model.
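
The iterative procedure described in the abstract can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the unigram base model, the perceptron over bigram features, the fixed `penalty` multiplier, the use of rejection sampling to draw from the reweighted model, and all function names. It also omits the check on how well the classifier succeeds before it is incorporated.

```python
import random
from collections import Counter


def bigrams(sentence):
    """Bigram features of a token tuple, with boundary markers."""
    padded = ("<s>",) + tuple(sentence) + ("</s>",)
    return list(zip(padded, padded[1:]))


def train_perceptron(real, fake, epochs=5):
    """Toy classifier (an assumption): a positive score means 'looks real'."""
    weights = Counter()
    data = [(s, 1) for s in real] + [(s, -1) for s in fake]
    for _ in range(epochs):
        for sentence, label in data:
            score = sum(weights[f] for f in bigrams(sentence))
            if label * score <= 0:  # misclassified: nudge the weights
                for f in bigrams(sentence):
                    weights[f] += label
    return weights


def is_fake(weights, sentence):
    """True when the classifier scores the sentence as model-generated."""
    return sum(weights[f] for f in bigrams(sentence)) < 0


def make_unigram_sampler(corpus, rng):
    """Hypothetical initial model: unigram tokens, lengths drawn from the corpus."""
    tokens = [t for s in corpus for t in s]
    lengths = [len(s) for s in corpus]

    def sample():
        return tuple(rng.choice(tokens) for _ in range(rng.choice(lengths)))

    return sample


def refine(corpus, iterations=3, n_samples=200, penalty=0.5, seed=0):
    """Return weight(s), an unnormalized multiplier on the base probability."""
    rng = random.Random(seed)
    base_sample = make_unigram_sampler(corpus, rng)
    classifiers = []

    def weight(sentence):
        w = 1.0
        for c in classifiers:
            if is_fake(c, sentence):
                w *= penalty  # lower the score of sentences classified as false
        return w

    def sample_current():
        # Rejection sampling: accept a base sample with probability weight(s) <= 1.
        while True:
            s = base_sample()
            if rng.random() < weight(s):
                return s

    for _ in range(iterations):
        fake = [sample_current() for _ in range(n_samples)]  # 'false' sentences
        classifiers.append(train_perceptron(corpus, fake))   # real vs. generated
    return weight


corpus = [("the", "cat", "sat"), ("the", "dog", "sat"),
          ("a", "cat", "ran"), ("a", "dog", "ran")]
weight = refine(corpus, iterations=2, n_samples=50, seed=1)
print(weight(("the", "cat", "sat")), weight(("sat", "the", "cat")))
```

The returned `weight` is only a multiplier on the base model's probability; turning it into a proper distribution (and hence a perplexity) would additionally require estimating the normalization constant, which this sketch does not attempt.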