We present a novel probabilistic classifier that scales well to problems involving a large number of classes and requiring training on large datasets. A prominent example of such a problem is language modeling. Our classifier is based on the assumption that each feature is associated with a predictive strength, which quantifies how well the feature can predict the class by itself. The predictions of individual features can then be combined according to their predictive strengths, resulting in a model whose parameters can be estimated reliably and efficiently. We show that a generative language model based on our classifier consistently matches modified Kneser-Ney smoothing and can outperform it if sufficiently rich features are incorporated.
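The abstract does not spell out the combination rule or how predictive strength is estimated. As a rough illustration only, the following minimal sketch assumes a linear mixture of per-feature class distributions p(c | f), weighted by a stand-in strength measure (here, how peaked each feature's empirical distribution is); all names and both formulas are hypothetical, not the paper's actual model.

```python
from collections import defaultdict


class StrengthWeightedClassifier:
    """Hypothetical sketch: mixes per-feature class predictions,
    weighting each feature by an assumed predictive strength."""

    def __init__(self):
        # counts[f][c]: how often class c co-occurred with feature f
        self.counts = defaultdict(lambda: defaultdict(int))
        self.strength = {}  # per-feature predictive strength

    def fit(self, examples):
        """examples: iterable of (active_features, class_label) pairs."""
        for feats, c in examples:
            for f in feats:
                self.counts[f][c] += 1
        for f, by_class in self.counts.items():
            total = sum(by_class.values())
            # Assumed strength measure: probability mass of the feature's
            # single best class, i.e. how peaked p(c | f) is. The paper's
            # actual estimator is not given in the abstract.
            self.strength[f] = max(by_class.values()) / total

    def predict_proba(self, feats):
        """Linear mixture of per-feature distributions p(c | f),
        weighted by predictive strength, then renormalized."""
        scores = defaultdict(float)
        z = 0.0
        for f in feats:
            if f not in self.counts:
                continue  # unseen feature contributes nothing
            total = sum(self.counts[f].values())
            w = self.strength[f]
            z += w
            for c, n in self.counts[f].items():
                scores[c] += w * (n / total)
        return {c: s / z for c, s in scores.items()} if z else {}


# Toy usage with bigram-style context features (hypothetical names):
clf = StrengthWeightedClassifier()
clf.fit([
    ({"w-1=the"}, "cat"),
    ({"w-1=the"}, "dog"),
    ({"w-2=feed"}, "cat"),
])
print(clf.predict_proba({"w-1=the", "w-2=feed"}))
```

Because each feature's distribution and strength can be estimated from simple counts, training in this sketch is a single pass over the data, which is consistent with the abstract's claim of efficient estimation at scale.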