We study task sequences that allow for speeding up the learner's average reward intake through appropriate shifts of inductive bias (changes of the learner's policy). To evaluate the long-term effects of bias shifts setting the stage for later bias shifts, we use the "success-story algorithm" (SSA). SSA is occasionally called at times that may depend on the policy itself. It uses backtracking to undo those bias shifts that have not been empirically observed to trigger long-term reward accelerations (measured up until the current SSA call). Bias shifts that survive SSA represent a lifelong success history. Until the next SSA call, they are considered useful and build the basis for additional bias shifts. SSA allows for plugging in a wide variety of learning algorithms. We plug in (1) a novel, adaptive extension of Levin search and (2) a method for embedding the learner's policy modification strategy within the policy itself (incremental self-improvement). Our inductive transfer case studies involve complex, partially observable environments where traditional reinforcement learning fails.
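The backtracking step described above can be sketched in code. The following is a minimal illustration, not the authors' implementation: each surviving bias shift is a stack entry recording when it was made, the cumulative reward at that moment, and a way to undo it. At an SSA call, the top of the stack is popped (and the corresponding policy change undone) whenever the reward rate since that shift does not exceed the reward rate since the previous surviving shift; the stack layout, the sentinel entry, and the `undo` callbacks are assumptions made for this sketch.

```python
def ssa_call(stack, policy, now, total_reward):
    """One call of the success-story algorithm (SSA), sketched.

    `stack` holds (time, reward_at_time, undo) tuples, one per surviving
    bias shift, with a sentinel (0, 0.0, None) at the bottom.  While the
    most recent shift's reward rate (reward gained per time step since
    the shift) does not beat the rate since the previous surviving
    shift, it is undone and popped.  What survives is a success history:
    each remaining shift accelerated reward intake so far.
    """
    while len(stack) >= 2:
        t_top, r_top, undo_top = stack[-1]   # most recent bias shift
        t_prev, r_prev, _ = stack[-2]        # previous surviving shift
        rate_top = (total_reward - r_top) / (now - t_top)
        rate_prev = (total_reward - r_prev) / (now - t_prev)
        if rate_top > rate_prev:
            break            # success-story criterion holds; keep all
        undo_top(policy)     # backtrack: revert this policy change
        stack.pop()
    return stack
```

For example, with shifts made at times 10 and 20 (cumulative rewards 5.0 and 8.0) and only 1.0 further reward by time 30, both shifts fail the criterion and are undone in reverse order; had the reward since time 20 been large, the loop would have stopped immediately and the whole history survived.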