Nonlinear credit assignment for musical sequences

  • Authors:
  • Judy A. Franklin; Victoria U. Manfredi

  • Affiliations:
  • Computer Science Department, Smith College, Northampton, MA; Computer Science Department, Smith College, Northampton, MA

  • Venue:
  • Second International Workshop on Intelligent Systems Design and Application
  • Year:
  • 2002

Abstract

Reinforcement learning is an exciting possibility for generating music because it can learn without explicit examples and can produce more than one response in a given state. We use reinforcement learning in the second phase of a jazz improvisor that learns to play jazz interactively with a human. The reinforcement signal is based on rules for improvisation. Because of time delays between a note being played and the subsequent reinforcement, a critic adjusts the reinforcement signal. We describe this system and then examine the ability of a temporal difference (TD) critic to predict reinforcement for three different sequential musical phenomena. A nonlinear network with a linear TD output unit and context traces on its input successfully predicts reinforcement values for these sequences and shows promise for use in musical reinforcement learning tasks.
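The critic described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes a one-hot note coding, an exponentially decaying context trace on the input, a single tanh hidden layer, a linear output unit trained with TD(0), and a toy reward that arrives only at the end of a short note sequence. All names (`context_trace`, `TDCritic`, `train_on_sequence`) and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def context_trace(prev_trace, note, decay=0.5):
    """Decay the previous trace and add a one-hot code for the current note."""
    trace = [decay * v for v in prev_trace]
    trace[note] += 1.0
    return trace


class TDCritic:
    """Assumed architecture: tanh hidden layer feeding one linear output unit."""

    def __init__(self, n_in, n_hidden, lr=0.05, gamma=0.9, seed=0):
        rng = random.Random(seed)
        # Each weight row carries one extra entry for a bias input.
        self.w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                      for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
        self.lr, self.gamma = lr, gamma

    def _hidden(self, x):
        xb = x + [1.0]  # append bias input
        return [math.tanh(sum(w * v for w, v in zip(row, xb)))
                for row in self.w_hid]

    def predict(self, x):
        hb = self._hidden(x) + [1.0]
        return sum(w * v for w, v in zip(self.w_out, hb))

    def update(self, x, reward, x_next, terminal=False):
        """Apply one TD(0) step and return the TD error."""
        xb = x + [1.0]
        h = self._hidden(x)
        hb = h + [1.0]
        value = sum(w * a for w, a in zip(self.w_out, hb))
        bootstrap = 0.0 if terminal else self.gamma * self.predict(x_next)
        delta = reward + bootstrap - value
        # Hidden weights are updated first, so they use the old output weights.
        for j, row in enumerate(self.w_hid):
            grad_h = self.w_out[j] * (1.0 - h[j] ** 2)
            for i in range(len(row)):
                row[i] += self.lr * delta * grad_h * xb[i]
        for j in range(len(self.w_out)):
            self.w_out[j] += self.lr * delta * hb[j]
        return delta


def train_on_sequence(notes, n_notes, episodes=3000):
    """Train on one fixed note sequence with a delayed reward of 1.0 at the end.

    Returns the critic and the mean absolute TD error of the first and
    last episodes.
    """
    critic = TDCritic(n_in=n_notes, n_hidden=8)
    # Precompute the trace seen after each note of the sequence.
    traces, trace = [], [0.0] * n_notes
    for note in notes:
        trace = context_trace(trace, note)
        traces.append(trace)
    first = last = 0.0
    for episode in range(episodes):
        total = 0.0
        for t, x in enumerate(traces):
            terminal = t == len(traces) - 1
            reward = 1.0 if terminal else 0.0
            x_next = None if terminal else traces[t + 1]
            total += abs(critic.update(x, reward, x_next, terminal))
        mean_error = total / len(traces)
        if episode == 0:
            first = mean_error
        last = mean_error
    return critic, first, last


if __name__ == "__main__":
    critic, first, last = train_on_sequence([0, 1, 2, 3], n_notes=4)
    print("mean |TD error|, first episode:", first)
    print("mean |TD error|, last episode:", last)
```

The trace lets the network distinguish positions in the sequence from the input alone, which is one plausible reading of "context traces on input"; the paper itself may use a different trace definition, a different TD variant, or a different network size.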