We present a unified probabilistic framework for statistical language modeling that can simultaneously incorporate various aspects of natural language, such as local word interactions, syntactic structure and semantic document information. Our approach is based on the latent maximum entropy principle, a statistical inference principle we recently proposed, which allows relationships over hidden features to be captured effectively in a unified model. Our work extends previous research on maximum entropy methods for language modeling, which can only model observed features. The ability to incorporate hidden variables conveniently extends the expressiveness of language models while reducing the need to pre-process the data into explicitly observed features. We describe efficient algorithms for marginalization, inference and normalization in our extended models. We then use these techniques to combine two standard forms of language models: local lexical models (Markov N-gram models) and global document-level semantic models (probabilistic latent semantic analysis). Our experimental results on the Wall Street Journal corpus show an 18.5% reduction in perplexity compared to a baseline tri-gram model with Good-Turing smoothing.
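To make the principle concrete, the following is a minimal sketch of the latent maximum entropy formulation as it is commonly stated, with x the observed data, z the hidden variables, y = (x, z) the complete data, \tilde{p}(x) the empirical distribution of the observations and f_i the feature functions; the notation is illustrative rather than taken verbatim from the paper:

\[
\max_{p}\; H(p) = -\sum_{y} p(y)\,\log p(y)
\qquad \text{subject to} \qquad
\sum_{y} p(y)\, f_i(y) \;=\; \sum_{x} \tilde{p}(x) \sum_{z} p(z \mid x)\, f_i(x, z),
\quad i = 1, \dots, m .
\]

Unlike ordinary maximum entropy, the right-hand side of each constraint depends on p itself through the posterior p(z | x), which is what permits constraints over hidden features but also makes the problem non-convex; this is the setting in which the marginalization, inference and normalization algorithms mentioned above are required.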