Nonlinear credit assignment for musical sequences

Authors:
Judy A. Franklin; Victoria U. Manfredi
Affiliations:
Computer Science Department, Smith College, Northampton, MA; Computer Science Department, Smith College, Northampton, MA
Venue:
Second International Workshop on Intelligent Systems Design and Application
Year:
2002

Abstract

Reinforcement learning is a promising approach to generating music because it can learn without explicit examples and can produce more than one response in a given state. We use reinforcement learning in the second phase of a jazz improvisor that learns to play jazz interactively with a human. The reinforcement signal is based on rules for improvisation. Because of time delays between a note being played and the subsequent reinforcement, a critic adjusts the reinforcement signal. We describe this system and then examine the ability of a temporal difference (TD) critic to predict reinforcement for three different sequential musical phenomena. A nonlinear network with a linear TD output unit and context traces on its inputs successfully predicts reinforcement values for these sequences and shows promise for use in musical reinforcement learning tasks.
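
The abstract names the ingredients of the critic (a nonlinear network, a linear TD output unit, context traces on the input) but gives no equations. The following is a minimal sketch of how such a critic might look, not the authors' implementation. It assumes a TD(0) update, a single tanh hidden layer, a one-hot note encoding, and exponentially decaying context traces; the names `context_trace` and `TDCritic` and all parameter values are hypothetical.

```python
import math
import random


def context_trace(prev_trace, note_one_hot, decay=0.5):
    """Blend the current one-hot note into a decaying memory of past notes.

    The exponential-decay form is an assumption; the abstract does not
    specify how the context traces are computed.
    """
    return [decay * p + x for p, x in zip(prev_trace, note_one_hot)]


class TDCritic:
    """Tanh hidden layer feeding one linear output that predicts reinforcement."""

    def __init__(self, n_in, n_hidden, alpha=0.05, gamma=0.9, seed=0):
        rng = random.Random(seed)
        self.w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                      for _ in range(n_hidden)]
        self.w_out = [0.0] * n_hidden  # linear output weights start at zero
        self.bias = 0.0
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)

    def forward(self, x):
        """Return (hidden activations, predicted reinforcement) for input x."""
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)))
             for row in self.w_hid]
        v = self.bias + sum(w * hj for w, hj in zip(self.w_out, h))
        return h, v

    def update(self, x, reward, x_next, terminal=False):
        """Apply one semi-gradient TD(0) step and return the TD error."""
        h, v = self.forward(x)
        v_next = 0.0 if terminal else self.forward(x_next)[1]
        delta = reward + self.gamma * v_next - v
        for j, hj in enumerate(h):
            # Gradient of the prediction w.r.t. hidden unit j's net input,
            # computed before the output weight is changed.
            grad_net = self.w_out[j] * (1.0 - hj * hj)
            self.w_out[j] += self.alpha * delta * hj
            for i, xi in enumerate(x):
                self.w_hid[j][i] += self.alpha * delta * grad_net * xi
        self.bias += self.alpha * delta
        return delta
```

In use, each note would be folded into the trace with `context_trace`, and `update` would be called once per time step with the rule-based reinforcement for that step. The trace gives the critic a short memory of preceding notes, which is one plausible way to let it anticipate reinforcement that arrives several notes after the action that earned it.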