A global optimization technique for statistical classifier design

Authors:
D. Miller; A.V. Rao; K. Rose; A. Gersho
Affiliations:
Dept. of Electr. Eng., Pennsylvania State Univ., University Park, PA
Venue:
IEEE Transactions on Signal Processing
Year:
1996

Cited by 14:

Combined learning and use for a mixture model equivalent to the RBF classifier. Neural Computation.
A Deterministic Annealing Approach for Parsimonious Design of Piecewise Regression Models. IEEE Transactions on Pattern Analysis and Machine Intelligence.
A Stochastic Neural Model for Fast Identification of Spatiotemporal Sequences. Neural Processing Letters.
Soft learning vector quantization. Neural Computation.
Stochastic Organization of Output Codes in Multiclass Learning Problems. Neural Computation.
Approximate Maximum Entropy Joint Feature Inference Consistent with Arbitrary Lower-Order Probability Constraints: Application to Statistical Classification. Neural Computation.
Statistical Mechanical Analysis of Fuzzy Clustering Based on Fuzzy Entropy. IEICE - Transactions on Information and Systems.
Learning vector quantization with adaptive prototype addition and removal. IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks.
A regularization framework for multiclass classification: A deterministic annealing approach. Pattern Recognition.
Construction cosine radial basic function neural networks based on artificial immune networks. ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications - Volume Part II.
Neuron selection for RBF neural network classifier based on multiple granularities immune network. ISNN'06 Proceedings of the Third international conference on Advances in Neural Networks - Volume Part I.
Annealed discriminant analysis. ECML'05 Proceedings of the 16th European conference on Machine Learning.
Error-resilient and complexity-constrained distributed coding for large scale sensor networks. Proceedings of the 11th international conference on Information Processing in Sensor Networks.
An adaptive classifier based on artificial immune network. LSMS'07 Proceedings of the 2007 international conference on Life System Modeling and Simulation.

Abstract

A global optimization method is introduced that minimizes the rate of misclassification. We first derive the theoretical basis for the method and then use it to develop a novel design algorithm, demonstrating its effectiveness and superior performance in the design of practical classifiers for some of the most popular structures currently in use. The method, grounded in ideas from statistical physics and information theory, extends the deterministic annealing approach for optimization in two ways: it incorporates structural constraints on data assignments to classes, and it uses the probability of error as the cost objective to be minimized. During the design, data are assigned to classes in probability so as to minimize the expected classification error given a specified level of randomness, as measured by Shannon's entropy. The constrained optimization is equivalent to a free-energy minimization, motivating a deterministic annealing approach in which the entropy and expected misclassification cost are reduced with the temperature while enforcing the classifier's structure. In the limit, a hard classifier is obtained. This approach is applicable to a variety of classifier structures, including the widely used prototype-based, radial basis function, and multilayer perceptron classifiers. The method is compared with learning vector quantization, back propagation (BP), and several radial basis function design techniques, as well as with paradigms for more directly optimizing all these structures to minimize probability of error. The annealing method achieves significant performance gains over other design methods on a number of benchmark examples from the literature, while often retaining design complexity comparable with or only moderately greater than that of strict descent methods. Substantial gains, both inside and outside the training set, are achieved for complicated examples involving high-dimensional data and large class overlap.
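The link between entropy-constrained probabilistic assignment and free-energy minimization can be illustrated with a small sketch. This is not the paper's algorithm. It drops the structural constraints entirely and assumes a single sample with fixed, hypothetical per-class costs (`costs`), and the function names and cooling schedule are likewise assumptions made for illustration. Under those assumptions, the assignment that minimizes the free energy F = (expected cost) - T * (entropy) at temperature T is the Gibbs distribution, and lowering T drives the soft assignment toward a hard one.

```python
import math

def gibbs_assignment(costs, temperature):
    """Class probabilities proportional to exp(-cost / temperature)."""
    lowest = min(costs)  # shift costs for numerical stability
    weights = [math.exp(-(c - lowest) / temperature) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]

def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def expected_cost(probs, costs):
    return sum(p * c for p, c in zip(probs, costs))

def free_energy(probs, costs, temperature):
    """F = expected cost - T * entropy."""
    return expected_cost(probs, costs) - temperature * entropy(probs)

def anneal(costs, t_start=2.0, t_end=0.01, cooling=0.5):
    """Track the Gibbs assignment while the temperature is lowered.

    The geometric cooling schedule is an assumption of this sketch.
    """
    schedule = []
    t = t_start
    while t >= t_end:
        p = gibbs_assignment(costs, t)
        schedule.append((t, p, entropy(p), expected_cost(p, costs)))
        t *= cooling
    return schedule

# Hypothetical costs of assigning one sample to each of three classes.
costs = [0.2, 0.5, 0.9]

for t, p, h, c in anneal(costs):
    rounded = [round(x, 3) for x in p]
    print(f"T={t:.4f}  p={rounded}  entropy={h:.3f}  cost={c:.3f}")
```

At high temperature the assignment is close to uniform; as the temperature falls, both the entropy and the expected cost decrease and the probability mass concentrates on the lowest-cost class. In the method described in the abstract, the assignment probabilities are additionally tied to the classifier's structure and its parameters are optimized at each temperature, which this sketch omits.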