Upper-confidence-bound algorithms for active learning in multi-armed bandits
ALT'11 Proceedings of the 22nd international conference on Algorithmic learning theory
We consider the problem of actively learning the mean values of distributions associated with a finite number of options. The decision maker can select which option to generate the next observation from, the goal being to produce estimates with equally good precision for all the options. If sample means are used to estimate the unknown values, then the optimal solution, assuming that the distributions are known up to a shift, is to sample from each distribution proportionally to its variance. No information other than the distributions' variances is needed to compute the optimal solution. In this paper we propose an incremental algorithm that asymptotically achieves the same loss as an optimal rule. We prove that the excess loss suffered by this algorithm, apart from logarithmic factors, scales as n^{-3/2}, which we conjecture to be the optimal rate. The performance of the algorithm is illustrated on a simple problem.
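The static oracle allocation mentioned in the abstract can be sketched as follows: if arm k has variance sigma_k^2 and receives n_k of the n samples, the expected squared error of its sample mean is sigma_k^2 / n_k, so choosing n_k proportional to sigma_k^2 equalizes the precision across arms. The snippet below is a minimal illustrative sketch of that allocation rule only (with made-up variances), not the incremental learning algorithm proposed in the paper, which must estimate the variances online.

```python
def optimal_allocation(variances, n):
    """Static oracle allocation: give arm k a share of the n samples
    proportional to its variance sigma_k^2 (at least one sample each).
    Illustrative sketch only; the paper's algorithm learns this online."""
    total = sum(variances)
    return [max(1, round(n * v / total)) for v in variances]

def estimation_losses(variances, counts):
    """Expected squared error of each arm's sample mean: sigma_k^2 / n_k."""
    return [v / c for v, c in zip(variances, counts)]

variances = [4.0, 1.0, 0.25]                    # hypothetical arm variances
counts = optimal_allocation(variances, 100)     # -> [76, 19, 5]
losses = estimation_losses(variances, counts)
print(counts)
print(losses)   # all three losses come out approximately equal (~0.05)
```

With these variances, the high-variance arm gets roughly 16 times as many samples as the low-variance arm, and the per-arm estimation errors end up nearly identical, which is exactly the "equally good precision" objective stated above.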