On the optimality of conditional expectation as a Bregman predictor

Authors:
A. Banerjee; X. Guo; H. Wang
Affiliations:
Dept. of Electr. & Comput. Eng., Univ. of Texas, Austin, TX, USA;-;-
Venue:
IEEE Transactions on Information Theory
Year:
2005

Abstract

We consider the problem of predicting a random variable X from observations, denoted by a random variable Z. It is well known that the conditional expectation E[X|Z] is the optimal L2 predictor (also known as the "least-mean-square error" predictor) of X among all (Borel measurable) functions of Z. In this correspondence, we provide necessary and sufficient conditions on general loss functions under which the conditional expectation is the unique optimal predictor. We show that E[X|Z] is the optimal predictor for all Bregman loss functions (BLFs), of which the L2 loss function is a special case. Moreover, under mild conditions, we show that the BLFs are exhaustive, i.e., if for every random variable X the infimum of E[F(X,y)] over all constants y is attained by the expectation E[X], then F is a BLF.
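
The claim that the expectation minimizes the expected Bregman loss over constants can be checked numerically. The sketch below is only an illustration and is not taken from the paper: it assumes the standard scalar definition D_phi(x, y) = phi(x) - phi(y) - phi'(y)(x - y) for a strictly convex, differentiable phi, uses two common generators (squared loss and generalized KL), and searches a grid of candidate constants y. The names bregman, GENERATORS, expected_loss and best_constant are hypothetical helpers introduced here.

```python
import math
import random


def bregman(phi, dphi, x, y):
    """Scalar Bregman loss D_phi(x, y) = phi(x) - phi(y) - phi'(y) * (x - y)."""
    return phi(x) - phi(y) - dphi(y) * (x - y)


# Illustrative generators (assumed here, not specified in the abstract).
# Each entry is (phi, phi'); "gen_kl" requires strictly positive arguments.
GENERATORS = {
    "squared": (lambda t: t * t, lambda t: 2.0 * t),
    "gen_kl": (lambda t: t * math.log(t) - t, lambda t: math.log(t)),
}


def expected_loss(name, sample, y):
    """Empirical average of D_phi(x, y) over the sample for a constant y."""
    phi, dphi = GENERATORS[name]
    return sum(bregman(phi, dphi, x, y) for x in sample) / len(sample)


def best_constant(name, sample, grid):
    """Grid point with the smallest empirical expected loss."""
    return min(grid, key=lambda y: expected_loss(name, sample, y))


rng = random.Random(0)
sample = [rng.uniform(0.5, 3.0) for _ in range(2000)]  # positive values only
mean = sum(sample) / len(sample)

grid = [0.5 + 0.01 * k for k in range(251)]  # candidate constants 0.50 .. 3.00
results = {name: best_constant(name, sample, grid) for name in GENERATORS}

for name, y_star in results.items():
    print(f"{name}: grid minimizer {y_star:.2f}, sample mean {mean:.4f}")
```

For both generators the grid minimizer should fall within one grid step of the sample mean. This reflects the identity E[D_phi(X, y)] - E[D_phi(X, E[X])] = D_phi(E[X], y), which is nonnegative and vanishes at y = E[X].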