Iterative scaling and coordinate descent methods for maximum entropy
ACLShort '09: Proceedings of the ACL-IJCNLP 2009 Conference Short Papers; The Journal of Machine Learning Research
Maximum entropy (Maxent) models are widely used in natural language processing and many other areas. Iterative scaling (IS) methods are among the most popular approaches for training Maxent models, but their many variants make them difficult to understand and compare. In this paper, we present a general and unified framework for iterative scaling methods. The framework also connects iterative scaling with coordinate descent methods. We prove general convergence results for IS methods and analyze their computational complexity. Based on the proposed framework, we extend a coordinate descent method for linear SVM to Maxent. Experiments show that it is faster than existing iterative scaling methods.
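To illustrate the coordinate descent idea the abstract refers to, the sketch below applies cyclic one-variable Newton updates to a regularized binary logistic-regression objective (a simple log-linear stand-in for a Maxent model). This is an assumption-laden illustration of the general technique, not the specific algorithm proposed in the paper; the function name and parameters are hypothetical.

```python
import numpy as np

def coordinate_descent_logreg(X, y, n_sweeps=50, reg=1.0):
    """Sketch: cyclic coordinate descent for L2-regularized binary
    logistic regression, minimizing
        reg/2 * ||w||^2 + sum_i log(1 + exp(-y_i * x_i . w)).
    Each inner step is a one-variable Newton update on w[j].
    X: (n, d) feature matrix; y: labels in {-1, +1}.
    Illustrative only -- not the algorithm from the paper."""
    n, d = X.shape
    w = np.zeros(d)
    z = X @ w  # cached decision values x_i . w, updated incrementally
    for _ in range(n_sweeps):
        for j in range(d):
            p = 1.0 / (1.0 + np.exp(-y * z))          # sigma(y_i * z_i)
            # Partial derivative and second derivative w.r.t. w[j]
            g = reg * w[j] - np.sum((1.0 - p) * y * X[:, j])
            h = reg + np.sum(p * (1.0 - p) * X[:, j] ** 2)
            step = g / h
            w[j] -= step
            z -= step * X[:, j]                        # keep the cache consistent
    return w
```

Caching the decision values `z` is what makes a single coordinate update cheap, O(n) per variable rather than O(nd); this cost structure is the reason coordinate-wise methods scale to large sparse problems.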