Unifying divergence minimization and statistical inference via convex duality

Authors:
Yasemin Altun; Alex Smola
Affiliations:
Toyota Technological Institute at Chicago, Chicago, IL; National ICT Australia, Canberra, ACT, Australia
Venue:
COLT'06 Proceedings of the 19th annual conference on Learning Theory
Year:
2006

References (11):
Support-Vector Networks. Machine Learning
Additive models, boosting, and inference for generalized divergences. COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Boosting as entropy projection. COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Logistic Regression, AdaBoost and Bregman Distances. COLT '00 Proceedings of the Thirteenth Annual Conference on Computational Learning Theory
Stability and generalization. The Journal of Machine Learning Research
Kernel Methods for Pattern Analysis
Exponential families for conditional random fields. UAI '04 Proceedings of the 20th conference on Uncertainty in artificial intelligence
Heteroscedastic Gaussian process regression. ICML '05 Proceedings of the 22nd international conference on Machine learning
Maximum entropy distribution estimation with generalized regularization. COLT'06 Proceedings of the 19th annual conference on Learning Theory
Sequential greedy approximation for certain convex optimization problems. IEEE Transactions on Information Theory
On minimizing distortion and relative entropy. IEEE Transactions on Information Theory

Cited by (15):
Value Regularization and Fenchel Duality. The Journal of Machine Learning Research
Classifying matrices with a spectral regularization. Proceedings of the 24th international conference on Machine learning
Causal reasoning by evaluating the complexity of conditional densities with kernel methods. Neurocomputing
Estimating labels from label proportions. Proceedings of the 25th international conference on Machine learning
Tailoring density estimation via reproducing kernel moment matching. Proceedings of the 25th international conference on Machine learning
A Hilbert Space Embedding for Distributions. ALT '07 Proceedings of the 18th international conference on Algorithmic Learning Theory
Estimating Labels from Label Proportions. The Journal of Machine Learning Research
Hash Kernels for Structured Data. The Journal of Machine Learning Research
Alternating projections for learning with expectation constraints. UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm. ACM Transactions on Algorithms (TALG)
Unsupervised transfer classification: application to text categorization. Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Information, Divergence and Risk for Binary Experiments. The Journal of Machine Learning Research
Maximum entropy distribution estimation with generalized regularization. COLT'06 Proceedings of the 19th annual conference on Learning Theory
A kernel two-sample test. The Journal of Machine Learning Research
Review: Divergence measures for statistical data processing-An annotated bibliography. Signal Processing

Abstract

In this paper we unify divergence minimization and statistical inference by means of convex duality. In doing so, we prove that the dual of approximate maximum entropy estimation includes maximum a posteriori estimation as a special case. Moreover, our treatment leads to stability and convergence bounds for many statistical learning problems. Finally, we show how an algorithm by Zhang can be used to solve this class of optimization problems efficiently.
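
The duality claimed in the abstract can be checked numerically on a toy problem. The sketch below is only one finite-dimensional instance and is not taken from the paper: it assumes a six-point domain, a hypothetical feature map, target moments `mu`, and a squared Euclidean penalty with weight `lam` on the moment mismatch. Under those assumptions, penalized maximum entropy (the primal) should attain the same value as the log-posterior of an exponential family under a Gaussian prior (the dual). All function and variable names are illustrative, and plain gradient ascent stands in for the algorithm by Zhang mentioned above.

```python
import numpy as np

def gibbs(theta, feats):
    """Exponential-family distribution p(x) proportional to exp(<theta, phi(x)>)."""
    scores = feats @ theta
    w = np.exp(scores - scores.max())  # subtract the max for numerical stability
    return w / w.sum()

def log_partition(theta, feats):
    """log Z(theta) = log sum_x exp(<theta, phi(x)>), computed stably."""
    scores = feats @ theta
    m = scores.max()
    return m + np.log(np.exp(scores - m).sum())

def dual_objective(theta, feats, mu, lam):
    """MAP objective: mean log-likelihood term plus Gaussian log-prior (up to constants)."""
    return theta @ mu - log_partition(theta, feats) - 0.5 * lam * (theta @ theta)

def primal_objective(p, feats, mu, lam):
    """Negative entropy plus a squared penalty on the moment mismatch."""
    slack = feats.T @ p - mu
    return np.sum(p * np.log(p)) + (slack @ slack) / (2.0 * lam)

def fit_map(feats, mu, lam, step=0.3, iters=3000):
    """Gradient ascent on the concave dual (assumed step size, not tuned)."""
    theta = np.zeros(feats.shape[1])
    for _ in range(iters):
        grad = mu - feats.T @ gibbs(theta, feats) - lam * theta
        theta = theta + step * grad
    return theta

# Hypothetical toy problem: x in {0, 0.2, ..., 1}, features phi(x) = (x, x^2).
xs = np.arange(6) / 5.0
feats = np.stack([xs, xs ** 2], axis=1)
mu = np.array([0.7, 0.55])  # assumed target moments
lam = 0.1                   # assumed penalty / prior precision

theta = fit_map(feats, mu, lam)
p = gibbs(theta, feats)
gap = primal_objective(p, feats, mu, lam) - dual_objective(theta, feats, mu, lam)
print("duality gap:", gap)
```

At the dual optimum the moment mismatch of `p` equals `-lam * theta`, which is why evaluating the primal at the Gibbs distribution closes the gap. Other penalties on the mismatch would lead to other regularizers in the dual; this sketch covers only the squared Euclidean case.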