Conditional random fields (CRFs) are graphical models of the probability of labels given observations. They have traditionally been trained on a set of observation and label pairs, under the assumption that, conditioned on the training data, the labels are independent and identically distributed (iid). In this paper we explore the use of CRFs in a class of temporal learning algorithms, namely policy-gradient reinforcement learning (RL). Here the labels are no longer iid: they are actions that change the state of the environment and thereby affect the next observation. From an RL point of view, CRFs provide a natural way to model joint actions in a decentralized Markov decision process, defining how agents can communicate with each other to choose the optimal joint action. Our experiments on a synthetic network alignment problem, a distributed sensor network, and road traffic control show that this approach clearly outperforms RL methods that do not model the proper joint policy.
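To make the combination concrete, here is a minimal sketch of policy-gradient (REINFORCE) learning with a log-linear action model, i.e. a CRF over a single action variable, p(a | o) ∝ exp(θ_a · o). The contextual-bandit environment, the feature dimensions, and all names below are illustrative stand-ins, not the paper's actual tasks or algorithm; the paper's CRF couples the actions of multiple agents, which this single-variable sketch omits.

```python
import numpy as np

# Toy environment (assumption, not from the paper): a two-action
# contextual bandit where action 1 pays off iff the first feature
# of the observation is positive.
rng = np.random.default_rng(0)
n_features, n_actions = 3, 2
theta = np.zeros((n_actions, n_features))  # log-linear policy parameters

def policy(obs):
    """Softmax over linear scores: p(a | o) proportional to exp(theta_a . o)."""
    scores = theta @ obs
    scores -= scores.max()  # subtract max for numerical stability
    p = np.exp(scores)
    return p / p.sum()

def reward(obs, action):
    return 1.0 if (obs[0] > 0) == (action == 1) else 0.0

alpha = 0.5  # step size
for step in range(2000):
    obs = rng.normal(size=n_features)
    p = policy(obs)
    a = rng.choice(n_actions, p=p)
    r = reward(obs, a)
    # REINFORCE update: grad log p(a|o) = (1[a'=a] - p(a'|o)) * o per row
    grad = -np.outer(p, obs)
    grad[a] += obs
    theta += alpha * r * grad
```

After training, the learned distribution concentrates on the rewarded action for each sign of the first feature; the same score-function gradient applies when the exponential-family policy ranges over structured joint actions, which is the setting the paper develops.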