Improved iterative scaling (IIS) is an algorithm for learning maximum entropy (ME) joint and conditional probability models, consistent with specified constraints, that has found great utility in natural language processing and related applications. In most IIS work on classification, discrete-valued "feature functions" are considered, depending on the data observations and class label, with constraints measured as frequency counts taken over hard (0/1) training set instances. Here, we consider the case where the training (and test) sets consist of instances of probability mass functions on the features, rather than hard feature values. IIS extends naturally to this case. This has applications (1) to ME classification on mixed discrete-continuous feature spaces and (2) to ME aggregation of soft classifier decisions in ensemble classification. Moreover, we combine these two approaches, yielding a method with proven learning convergence that jointly performs (soft) decision-level and feature-level fusion in making ensemble decisions. We demonstrate favorable comparisons against standard AdaBoost.M1, input-dependent boosting, and other supervised combining methods on data sets from the UC Irvine Machine Learning Repository.
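For concreteness, the following is a minimal sketch (not the authors' code) of how an IIS update can accommodate soft feature values: the empirical constraints become averages of fractional feature expectations rather than hard 0/1 counts, and each coordinate update delta_j is obtained by a one-dimensional Newton solve. The function name iis_soft, the dense (instances, classes, features) array layout, and the nonnegative-feature assumption are illustrative choices, not details taken from the paper.

    import numpy as np

    def iis_soft(F, y, n_iters=100, tol=1e-6):
        """Fit lambda for p(y|x) proportional to exp(sum_j lambda_j f_j(x, y)).

        F : array (N, C, J) of soft feature values f_j(x_i, y) -- fractional
            (expected) values under per-instance pmfs, not hard 0/1 counts.
            IIS assumes the features are nonnegative.
        y : array (N,) of true class labels.
        """
        N, C, J = F.shape
        lam = np.zeros(J)
        # Empirical (soft) constraints: mean feature value at the true label.
        emp = F[np.arange(N), y, :].mean(axis=0)
        fsharp = F.sum(axis=2)                    # f#(x_i, y): total feature mass
        for _ in range(n_iters):
            scores = F @ lam                      # (N, C) unnormalized log-scores
            scores -= scores.max(axis=1, keepdims=True)
            p = np.exp(scores)
            p /= p.sum(axis=1, keepdims=True)     # model posteriors p(y|x_i)
            delta = np.zeros(J)
            for j in range(J):
                # Solve for delta_j in the IIS auxiliary equation:
                # (1/N) sum_i sum_y p(y|x_i) f_j(x_i,y) exp(delta_j f#(x_i,y)) = emp_j
                d = 0.0
                for _ in range(25):               # Newton iterations
                    e = np.exp(d * fsharp)
                    g = (p * F[:, :, j] * e).sum() / N - emp[j]
                    h = (p * F[:, :, j] * fsharp * e).sum() / N
                    if h <= 0.0:                  # feature carries no mass
                        break
                    step = g / h
                    d -= step
                    if abs(step) < 1e-10:
                        break
                delta[j] = d
            lam += delta
            if np.max(np.abs(delta)) < tol:
                break
        return lam

With all features given as hard 0/1 indicators, this reduces to standard IIS; supplying expected indicator values instead (for example, class posteriors produced by base classifiers) yields the kind of soft-decision aggregation the abstract describes.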