We study the problem of how a computer program can learn, by interacting with an environment, to return an algorithm for solving a class of problems. The two example domains studied in this paper are Blocks World stacking problems and Rubik's Cube. Our approach is to simulate the evolution of an artificial economy of computer programs called "agents". Simple rules imposed on the economy result in credit assignment, factoring the problem of evolving an overall program for the class of problems into simpler problems of evolving agents that specialize in aspects of the problem and collaborate to solve the overall class. In this paper our agents are Post Production Systems. Our system, Hayek4, has learned from random examples a program that solves arbitrary block stacking problems. The program essentially consists of about 5 learned rules and some learned control information. Solving an instance with n blocks in its goal stack requires automatically chaining the rules in the correct sequence to a depth of about 2n. Hayek4 has also learned to solve Rubik's Cubes scrambled with up to about 7 random rotations. In the context of automatic theorem proving, these results can also be viewed as a way to learn domain knowledge that allows compact proofs to be generated automatically.
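To make the idea of chaining a handful of rules concrete, here is a minimal sketch of a rule-driven block stacker. It is an illustration under stated assumptions, not Hayek4 itself: the three rules are hand-written rather than learned, there is no economy or credit assignment, and the three-stack layout and every name (`rule_clear_wrong`, `rule_place_next`, `rule_dig`, `solve`) are hypothetical. It shows only how a small, fixed rule set can be applied repeatedly, with the number of rule firings growing with the size of the goal stack.

```python
from typing import Callable, List, Optional, Tuple

State = List[List[str]]   # three stacks, each listed bottom to top; stack 0 is the goal stack
Move = Tuple[int, int]    # (source stack, destination stack)
Rule = Callable[[State, List[str]], Optional[Move]]


def correct_prefix(state: State, goal: List[str]) -> int:
    """Number of blocks at the bottom of stack 0 that already match the goal."""
    n = 0
    for have, want in zip(state[0], goal):
        if have != want:
            break
        n += 1
    return n


def rule_clear_wrong(state: State, goal: List[str]) -> Optional[Move]:
    # Hypothetical rule: a block sits on stack 0 above the correct prefix, so move it away.
    if len(state[0]) > correct_prefix(state, goal):
        return (0, 1)
    return None


def rule_place_next(state: State, goal: List[str]) -> Optional[Move]:
    # Hypothetical rule: the next needed block is on top of a side stack, so place it.
    k = correct_prefix(state, goal)
    if len(state[0]) != k or k == len(goal):
        return None
    for s in (1, 2):
        if state[s] and state[s][-1] == goal[k]:
            return (s, 0)
    return None


def rule_dig(state: State, goal: List[str]) -> Optional[Move]:
    # Hypothetical rule: the next needed block is buried, so move the block covering it aside.
    k = correct_prefix(state, goal)
    if len(state[0]) != k or k == len(goal):
        return None
    for s in (1, 2):
        if goal[k] in state[s][:-1]:
            return (s, 3 - s)
    return None


RULES: List[Rule] = [rule_clear_wrong, rule_place_next, rule_dig]


def solve(start: State, goal: List[str], max_steps: int = 1000) -> Tuple[State, List[str]]:
    """Fire the first applicable rule until none applies; return the final state and rule trace."""
    state = [list(stack) for stack in start]  # work on a copy
    trace: List[str] = []
    for _ in range(max_steps):
        for rule in RULES:
            move = rule(state, goal)
            if move is not None:
                src, dst = move
                state[dst].append(state[src].pop())
                trace.append(rule.__name__)
                break
        else:
            break  # no rule fired: either solved or stuck
    return state, trace


if __name__ == "__main__":
    final, trace = solve([["c"], ["a", "b"], []], ["a", "b", "c"])
    print(final[0], len(trace), trace)
```

Trying the first rule that matches is one simple form of control information. In the system described above, both the rules and the control information are learned rather than fixed by hand.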