Generating three binary addition algorithms using reinforcement programming

  • Authors:
  • Spencer White; Tony Martinez; George Rudolph

  • Affiliations:
  • Duvall, WA; Brigham Young University, Provo, UT; Department of Mathematics and Computer Science, Charleston, SC

  • Venue:
  • Proceedings of the 48th Annual Southeast Regional Conference
  • Year:
  • 2010

Abstract

Reinforcement Programming (RP) is a new technique for automatically generating a computer program using reinforcement learning methods. This paper describes how RP learned to generate code for three binary addition problems: simulating a full adder circuit, incrementing a binary number, and adding two binary numbers. Each problem is presented as an extension of the previous one, which provides an introduction to the practical application of RP. Each solution uses a dynamic, episodic form of the delayed Q-Learning algorithm. "Dynamic" means that the algorithm grows the policy during learning and prunes it before the policy is translated to source code. This differs from Q-Learning models that use fixed-size tables or neural network function approximators to store Q-values associated with (state, action) pairs. The states, actions, rewards, other parameters, and experimental results are presented for each of the three problems.
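
The idea of a policy that grows during learning and is pruned afterwards can be illustrated with a minimal sketch. This is not the paper's implementation: it uses ordinary one-step Q-Learning rather than delayed Q-Learning, and the class name `DynamicQTable`, the reward of 1 for a correct output, the learning parameters, and the encoding of states and actions are all assumptions made for illustration. The toy task is the full adder, treated as a one-step episode in which the state is the input triple (a, b, carry-in) and the action is the output pair (sum, carry-out).

```python
import itertools
import random


def full_adder(a, b, cin):
    """Reference behaviour: return (sum_bit, carry_out) for three input bits."""
    total = a + b + cin
    return total % 2, total // 2


class DynamicQTable:
    """Hypothetical Q-table backed by a dict that grows as pairs are visited."""

    def __init__(self, actions, alpha=0.5, gamma=0.9):
        self.actions = list(actions)
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        self.q = {}         # (state, action) -> Q-value; starts empty

    def value(self, state, action):
        # Unvisited pairs are not stored and default to 0.0.
        return self.q.get((state, action), 0.0)

    def best_action(self, state):
        return max(self.actions, key=lambda a: self.value(state, a))

    def update(self, state, action, reward, next_state=None, done=True):
        # Standard one-step Q-Learning update (not delayed Q-Learning).
        if done:
            target = reward
        else:
            target = reward + self.gamma * max(
                self.value(next_state, a) for a in self.actions
            )
        old = self.value(state, action)
        self.q[(state, action)] = old + self.alpha * (target - old)

    def prune(self):
        """Keep only the greedy action for each visited state."""
        visited = {state for (state, _) in self.q}
        return {state: self.best_action(state) for state in visited}


def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    states = list(itertools.product((0, 1), repeat=3))   # (a, b, cin)
    actions = list(itertools.product((0, 1), repeat=2))  # (sum, carry_out)
    table = DynamicQTable(actions)
    for _ in range(episodes):
        state = rng.choice(states)
        action = rng.choice(actions)  # pure random exploration
        # Assumed reward scheme: 1 for the correct output, 0 otherwise.
        reward = 1.0 if action == full_adder(*state) else 0.0
        table.update(state, action, reward)
    return table


table = train()
policy = table.prune()
```

After training, `policy` maps each visited input triple to a single output pair, which is the kind of compact mapping that could then be translated into source code. The table holds at most 32 entries here (8 states times 4 actions), but it is only as large as the set of pairs actually visited.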