Single-agent reinforcement learners in time-extended domains and multi-agent systems share a common difficulty known as the credit assignment problem. Multi-agent systems face the structural credit assignment problem of determining the contribution of a particular agent to a common task. Time-extended single-agent systems, in contrast, face the temporal credit assignment problem of determining the contribution of a particular action to the quality of the full sequence of actions. Traditionally these two problems are considered distinct and are handled in separate ways. In this article we show that these two forms of the credit assignment problem are equivalent. In this unified framework, a single-agent Markov decision process can be broken down into a single-time-step multi-agent process. Furthermore, we show that Monte Carlo estimation and Q-learning (depending on whether the values of the resulting actions in the episode are known at the time of learning) are equivalent to different agent utility functions in a multi-agent system. This equivalence shows how an often neglected issue in multi-agent systems is equivalent to a well-known deficiency in multi-time-step learning, and it lays the basis for solving time-extended multi-agent problems, where both credit assignment problems are present.
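The equivalence described above can be sketched concretely: treat each time step of a single-agent episode as a separate "agent" in a one-shot multi-agent game, and compare the utility each step-agent would receive under Monte Carlo estimation (the observed discounted return from that step onward) versus Q-learning (the immediate reward plus a bootstrapped value estimate). This is a minimal illustrative sketch, not the paper's formalism; the function names, the discount factor, and the example numbers are all assumptions introduced here.

```python
# Illustrative sketch (not from the paper): each time step t of a single
# episode is viewed as a "step-agent" that receives a utility. Monte Carlo
# assigns the observed tail return; Q-learning assigns a bootstrapped
# one-step target built from current value estimates.

GAMMA = 0.9  # assumed discount factor

def monte_carlo_utilities(rewards):
    """Utility of step-agent t = discounted return observed from t to episode end."""
    utilities = []
    g = 0.0
    for r in reversed(rewards):
        g = r + GAMMA * g
        utilities.append(g)
    return list(reversed(utilities))

def q_learning_utilities(rewards, value_estimates):
    """Utility of step-agent t = r_t + gamma * V_hat(s_{t+1}),
    bootstrapping from current estimates instead of the observed tail."""
    utilities = []
    for t, r in enumerate(rewards):
        next_v = value_estimates[t + 1] if t + 1 < len(value_estimates) else 0.0
        utilities.append(r + GAMMA * next_v)
    return utilities

# Example three-step episode: reward only at the end.
rewards = [0.0, 0.0, 1.0]
value_estimates = [0.5, 0.6, 0.8]  # hypothetical V_hat for states 1..3; terminal value 0

mc = monte_carlo_utilities(rewards)                  # ≈ [0.81, 0.9, 1.0]
ql = q_learning_utilities(rewards, value_estimates)  # ≈ [0.54, 0.72, 1.0]
```

The two utility vectors differ only in whether the tail of the sequence is taken from the observed episode (Monte Carlo) or from current estimates (Q-learning), which is the distinction the abstract ties to whether the values of the resulting actions are known at learning time.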