Improved iterative scaling (IIS) is an algorithm for learning maximum entropy (ME) joint and conditional probability models, consistent with specified constraints, that has found great utility in natural language processing and related applications. In most IIS work on classification, discrete-valued "feature functions" are considered, depending on the data observations and class label, with constraints measured as frequency counts taken over hard (0/1) training set instances. Here, we consider the case where the training (and test) sets consist of instances of probability mass functions on the features, rather than hard feature values. IIS extends naturally to this case. This has applications (1) to ME classification on mixed discrete-continuous feature spaces and (2) to ME aggregation of soft classifier decisions in ensemble classification. Moreover, we combine these two approaches, yielding a method with proven learning convergence that jointly performs (soft) decision-level and feature-level fusion in making ensemble decisions. We demonstrate favorable comparisons against standard AdaBoost.M1, input-dependent boosting, and other supervised combining methods on data sets from the UC Irvine Machine Learning Repository.
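For concreteness, the following is a minimal sketch (not the authors' code) of how an IIS update can accommodate soft feature values: the empirical constraints become averages of fractional feature expectations rather than hard 0/1 counts, and each coordinate update delta_j is obtained by a one-dimensional Newton solve. The function name iis_soft, the dense (instances, classes, features) array layout, and the nonnegative-feature assumption are illustrative choices, not details taken from the paper.

    import numpy as np

    def iis_soft(F, y, n_iters=100, tol=1e-6):
        """Fit lambda for p(y|x) proportional to exp(sum_j lambda_j f_j(x, y)).

        F : array (N, C, J) of soft feature values f_j(x_i, y) -- fractional
            (expected) values under per-instance pmfs, not hard 0/1 counts.
            IIS assumes the features are nonnegative.
        y : array (N,) of true class labels.
        """
        N, C, J = F.shape
        lam = np.zeros(J)
        # Empirical (soft) constraints: mean feature value at the true label.
        emp = F[np.arange(N), y, :].mean(axis=0)
        fsharp = F.sum(axis=2)                    # f#(x_i, y): total feature mass
        for _ in range(n_iters):
            scores = F @ lam                      # (N, C) unnormalized log-scores
            scores -= scores.max(axis=1, keepdims=True)
            p = np.exp(scores)
            p /= p.sum(axis=1, keepdims=True)     # model posteriors p(y|x_i)
            delta = np.zeros(J)
            for j in range(J):
                # Solve for delta_j in the IIS auxiliary equation:
                # (1/N) sum_i sum_y p(y|x_i) f_j(x_i,y) exp(delta_j f#(x_i,y)) = emp_j
                d = 0.0
                for _ in range(25):               # Newton iterations
                    e = np.exp(d * fsharp)
                    g = (p * F[:, :, j] * e).sum() / N - emp[j]
                    h = (p * F[:, :, j] * fsharp * e).sum() / N
                    if h <= 0.0:                  # feature carries no mass
                        break
                    step = g / h
                    d -= step
                    if abs(step) < 1e-10:
                        break
                delta[j] = d
            lam += delta
            if np.max(np.abs(delta)) < tol:
                break
        return lam

With all features given as hard 0/1 indicators, this reduces to standard IIS; supplying expected indicator values instead (for example, class posteriors produced by base classifiers) yields the kind of soft-decision aggregation the abstract describes.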