A Bayesian/Information Theoretic Model of Learning to Learn via Multiple Task Sampling

  • Authors:
  • Jonathan Baxter

  • Affiliations:
  • Department of Mathematics, London School of Economics and Department of Computer Science, Royal Holloway College, University of London. E-mail: jon@syseng.anu.edu.au

  • Venue:
  • Machine Learning - Special issue on inductive transfer
  • Year:
  • 1997

Abstract

A Bayesian model of learning to learn by sampling from multiple tasks is presented. The multiple tasks are themselves generated by sampling from a distribution over an environment of related tasks. Such an environment is shown to be naturally modelled within a Bayesian context by the concept of an objective prior distribution. It is argued that for many common machine learning problems, although in general we do not know the true (objective) prior for the problem, we do have some idea of a set of possible priors to which the true prior belongs. It is shown that under these circumstances a learner can use Bayesian inference to learn the true prior by learning sufficiently many tasks from the environment. In addition, bounds are given on the amount of information required to learn a task when it is simultaneously learnt with several other tasks. The bounds show that if the learner has little knowledge of the true prior, but the dimensionality of the true prior is small, then sampling multiple tasks is highly advantageous. The theory is applied to the problem of learning a common feature set or, equivalently, a low-dimensional representation (LDR) for an environment of related tasks.
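The shared-representation setting the abstract describes can be sketched numerically: tasks drawn from a common environment share an unknown low-dimensional feature map, and observing many tasks at once lets a learner recover it. The code below is a minimal illustrative sketch, not the paper's Bayesian/information-theoretic analysis; it uses alternating least squares on synthetic linear tasks, and all dimensions, noise levels, and variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Environment of related tasks: each task is linear through a shared
# low-dimensional representation (LDR). W_true is the common feature map;
# per-task weights V_true play the role of samples from the environment's prior.
d, k, n_tasks, n_per_task = 20, 3, 10, 50
W_true = rng.normal(size=(d, k))          # shared map (unknown to the learner)
V_true = rng.normal(size=(k, n_tasks))    # task-specific weights

X = [rng.normal(size=(n_per_task, d)) for _ in range(n_tasks)]
Y = [X[t] @ W_true @ V_true[:, t] + 0.01 * rng.normal(size=n_per_task)
     for t in range(n_tasks)]

# Alternating least squares: jointly estimate the shared map W and the
# per-task weights v_t from all tasks simultaneously.
W = rng.normal(size=(d, k))
for _ in range(50):
    # Step 1: per-task weights given the current W (one small OLS per task).
    V = np.column_stack([np.linalg.lstsq(X[t] @ W, Y[t], rcond=None)[0]
                         for t in range(n_tasks)])
    # Step 2: shared W given all tasks' weights. Using
    # X_t W v_t = (v_t^T ⊗ X_t) vec(W), stack every task into one OLS problem.
    A = np.vstack([np.kron(V[:, t][None, :], X[t]) for t in range(n_tasks)])
    b = np.concatenate(Y)
    W = np.linalg.lstsq(A, b, rcond=None)[0].reshape(d, k, order="F")

# Average training error across tasks; it should approach the noise floor.
mse = np.mean([np.mean((X[t] @ W @ V[:, t] - Y[t]) ** 2)
               for t in range(n_tasks)])
print(f"mean squared error across tasks: {mse:.2e}")
```

Pooling all ten tasks into the single least-squares problem for `W` is what makes multiple-task sampling pay off here: the shared map has only `d*k` parameters, so data from every task constrains it at once, echoing the abstract's point that a low-dimensional true prior makes multi-task learning highly advantageous.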