Training connectionist models for the structured language model

  • Authors:
  • Peng Xu; Ahmad Emami; Frederick Jelinek

  • Affiliations:
  • Johns Hopkins University, Baltimore, MD (all authors)

  • Venue:
  • EMNLP '03: Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing
  • Year:
  • 2003

Abstract

We investigate the performance of the Structured Language Model (SLM) in terms of perplexity (PPL) when its components are modeled by connectionist models. The connectionist models use a distributed representation of the items in the history and make much better use of context than the interpolated or back-off models currently in use, not only because of the inherent capability of connectionist models to combat data sparseness, but also because of the sublinear growth in model size as the context length is increased. The connectionist models can be further trained by an EM procedure similar to the one previously used for training the SLM. Our experiments show that, after interpolation with a baseline trigram language model, the connectionist models significantly improve the PPL over interpolated and back-off models on the UPenn Treebank corpus. The EM training procedure improves the connectionist models further by using hidden events obtained from the SLM parser.
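
To make the abstract's architecture concrete, the sketch below (not the authors' code) shows the general shape of such a connectionist component: each history item is mapped to a learned distributed (embedding) vector, the vectors are concatenated and passed through one hidden layer, and a softmax gives P(next word | history); perplexity is then computed after linear interpolation with a baseline trigram model. All names, sizes, and the interpolation weight are illustrative assumptions, and the parameters are left untrained.

```python
import numpy as np

rng = np.random.default_rng(0)

V = 1000   # vocabulary size (assumption)
D = 30     # embedding dimension (assumption)
H = 100    # hidden-layer size (assumption)
N = 2      # number of history items fed to the model (assumption)

# Parameters (randomly initialised here; in the paper they are trained on text).
E  = rng.normal(0, 0.1, (V, D))        # distributed representations of words
W1 = rng.normal(0, 0.1, (N * D, H))    # input  -> hidden weights
W2 = rng.normal(0, 0.1, (H, V))        # hidden -> output weights

def next_word_probs(history_ids):
    """P(w | history) from a connectionist model over a fixed-length history."""
    x = np.concatenate([E[i] for i in history_ids])   # distributed history
    h = np.tanh(x @ W1)                               # hidden layer
    logits = h @ W2
    logits -= logits.max()                            # numerical stability
    p = np.exp(logits)
    return p / p.sum()

def interpolated_ppl(test_ids, trigram_prob, lam=0.5):
    """Perplexity of the connectionist model linearly interpolated with a
    baseline trigram model; `trigram_prob` is any callable (w, h) -> P(w | h),
    and `lam` is an assumed interpolation weight."""
    logprob = 0.0
    for t in range(N, len(test_ids)):
        h = test_ids[t - N:t]
        p_conn = next_word_probs(h)[test_ids[t]]
        p = lam * p_conn + (1 - lam) * trigram_prob(test_ids[t], h)
        logprob += np.log(p)
    return np.exp(-logprob / (len(test_ids) - N))

# Toy usage: a uniform "trigram" stand-in and a random word-id sequence.
if __name__ == "__main__":
    seq = rng.integers(0, V, size=50).tolist()
    print(interpolated_ppl(seq, trigram_prob=lambda w, h: 1.0 / V))
```

Note that adding one more history position only adds one embedding block and N·D-to-H weights, which is the sublinear growth in model size (relative to an n-gram table) that the abstract refers to.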