Surrogate maximization/minimization algorithms and extensions

  • Authors: Zhihua Zhang, James T. Kwok, Dit-Yan Yeung

  • Affiliations:
  • Zhihua Zhang: Department of Electrical Engineering and Computer Science, University of California, Berkeley, CA
  • James T. Kwok, Dit-Yan Yeung: Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong

  • Venue: Machine Learning
  • Year: 2007

Abstract

Surrogate maximization (or minimization) (SM) algorithms are a family of algorithms that can be regarded as a generalization of expectation-maximization (EM) algorithms. An SM algorithm aims at turning an otherwise intractable maximization problem into a tractable one by iterating two steps. The S-step computes a tractable surrogate function to substitute for the original objective function, and the M-step seeks to maximize this surrogate function. Convexity plays a central role in the S-step. SM algorithms enjoy the same convergence properties as EM algorithms. There are three main approaches to the construction of surrogate functions, namely, by using Jensen's inequality, first-order Taylor approximation, and the low quadratic bound principle. In this paper, we demonstrate the usefulness of SM algorithms by taking logistic regression models, AdaBoost, and the log-linear model as examples. More specifically, by using different surrogate function construction methods, we devise several SM algorithms, including the standard SM, generalized SM, gradient SM, and quadratic SM algorithms, as well as two variants called the conditional surrogate maximization (CSM) and surrogate conditional maximization (SCM) algorithms.
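To make the two-step iteration concrete, below is a minimal sketch (not taken from the paper) of a quadratic-bound SM iteration for binary logistic regression, written in the minimization form: the negative log-likelihood is decreased by repeatedly minimizing a quadratic surrogate. It relies on the standard curvature bound for the logistic loss, whose Hessian is dominated by (1/4) X^T X, so each M-step has a closed-form solution. The function name sm_logistic, the synthetic data, and the iteration budget are illustrative assumptions, not details from the paper.

```python
# Illustrative quadratic-bound SM iteration for binary logistic regression.
# Minimizes the negative log-likelihood f(beta) by iterating:
#   S-step: build the surrogate
#     g(b) = f(beta) + grad^T (b - beta) + (1/8)(b - beta)^T X^T X (b - beta),
#   which majorizes f because the logistic Hessian satisfies H <= (1/4) X^T X.
#   M-step: minimize g in closed form.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sm_logistic(X, y, n_iter=200):
    """X: (n, d) design matrix; y: (n,) labels in {0, 1}."""
    n, d = X.shape
    beta = np.zeros(d)
    # The curvature bound (1/4) X^T X is fixed, so its inverse is computed once;
    # every M-step then reduces to a single matrix-vector product.
    B_inv = np.linalg.inv(X.T @ X / 4.0)
    for _ in range(n_iter):
        grad = X.T @ (sigmoid(X @ beta) - y)  # gradient of the NLL at beta
        beta = beta - B_inv @ grad            # M-step: minimizer of the surrogate
    return beta

# Toy usage on synthetic data (hypothetical example, for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = (sigmoid(X @ np.array([1.0, -2.0, 0.5])) > rng.random(100)).astype(float)
print(sm_logistic(X, y))
```

Because the surrogate touches the objective at the current iterate and bounds it from above everywhere else, each update can only decrease the negative log-likelihood; this is the same monotonicity argument that underlies the convergence properties the abstract ascribes to SM and EM algorithms.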