The goal of metalearning is to generate useful shifts of inductive bias by adapting the current learning strategy in a ``useful'' way. Our learner leads a single life during which actions are continually executed according to the system's internal state and current {\em policy} (a modifiable, probabilistic algorithm mapping environmental inputs and internal states to outputs and new internal states). An action is considered a learning algorithm if it can modify the policy. Effects of learning processes on later learning processes are measured using reward/time ratios. Occasional backtracking enforces success histories of still valid policy modifications corresponding to histories of lifelong reward accelerations. The principle allows for plugging in a wide variety of learning algorithms. In particular, it allows for embedding the learner's policy modification strategy within the policy itself (self-reference). To demonstrate the principle's feasibility in cases where conventional reinforcement learning fails, we test it in complex, non-Markovian, changing environments (``POMDPs''). One of the tasks involves more than $10^{13}$ states, two learners that both cooperate and compete, and strongly delayed reinforcement signals (initially separated by more than 300,000 time steps).
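Concretely, the ``success histories'' above can be read as a stack invariant: each still-valid policy modification is a checkpoint, and backtracking pops checkpoints whose reward/time ratio fails to improve on that of every earlier surviving checkpoint. The Python sketch below illustrates this criterion under simplifying assumptions (cumulative reward and time are tracked externally, and each modification can be undone via a callback); the names `Checkpoint`, `ssc_holds`, and `ssa_backtrack` are hypothetical, not from the paper.

```python
# Minimal sketch of the success-story criterion (SSC) and its backtracking
# step, as described above. All names here are illustrative; they are not
# taken from the paper's code.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Checkpoint:
    """A still-valid policy modification: when it was made, the cumulative
    reward collected up to that point, and a callback that undoes it."""
    time: float
    reward: float
    undo: Callable[[], None]

def ssc_holds(stack: List[Checkpoint], now: float, reward_now: float) -> bool:
    """Success-story criterion: the reward/time ratio measured since each
    still-valid modification must strictly increase from the oldest to the
    newest modification, i.e. every surviving bias shift must mark the start
    of a long-term acceleration of the average reward intake. Assumes
    now > cp.time for every checkpoint on the stack."""
    prev_ratio = float("-inf")
    for cp in stack:  # bottom (oldest) to top (newest)
        ratio = (reward_now - cp.reward) / (now - cp.time)
        if ratio <= prev_ratio:
            return False
        prev_ratio = ratio
    return True

def ssa_backtrack(stack: List[Checkpoint], now: float, reward_now: float) -> None:
    """Occasional backtracking: pop and undo the most recent modifications
    until the remaining stack satisfies the SSC. What survives is a success
    history corresponding to lifelong reward accelerations."""
    while stack and not ssc_holds(stack, now, reward_now):
        stack.pop().undo()
```

In this reading, a learner would invoke `ssa_backtrack` at each evaluation point; only policy modifications that have sustained an acceleration of the average reward remain valid afterwards.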