Real-valued Q-learning in multi-agent cooperation

  • Authors:
  • Kao-Shing Hwang; Chia-Yue Lo; Kim-Joan Chen

  • Venue:
  • SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics
  • Year:
  • 2009

Abstract

In this paper, we propose Q-learning with a continuous action policy and extend the algorithm to a multi-agent system. We examine the algorithm in a task in which two robots, connected by a straight bar, act independently but must cooperate to reach the goal while avoiding obstacles in the environment. Conventional Q-learning requires a pre-defined, discrete state space and therefore cannot distinguish variations among different situations that fall into the same state. We introduce a Stochastic Real-Valued unit into Q-learning to differentiate the actions corresponding to distinct state inputs that are categorized into the same state. This unit can be regarded as an action evaluation module, which models and produces the expected evaluation signal, combined with an action selection unit that generates an action expected to perform better, using a probability distribution function that estimates the optimal action selection policy. The results from both simulation and experiment demonstrate the better performance and applicability of the proposed learning model.
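The abstract's combination of a tabular expected-evaluation signal with a Gaussian action-selection unit can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class name `SRVQLearner`, the single real-valued action per state, the exploration width shrinking with the expected evaluation, and all hyperparameter values are assumptions for illustration.

```python
import numpy as np

class SRVQLearner:
    """Sketch of Q-learning with a stochastic real-valued action unit.

    Assumptions (not from the paper): one continuous action per discrete
    state, Gaussian exploration whose width shrinks as the expected
    evaluation improves, and a standard tabular Q-value update.
    """

    def __init__(self, n_states, alpha=0.1, gamma=0.9, sigma_max=0.5):
        self.q = np.zeros(n_states)      # expected evaluation per state
        self.mu = np.zeros(n_states)     # mean real-valued action per state
        self.alpha = alpha               # learning rate
        self.gamma = gamma               # discount factor
        self.sigma_max = sigma_max       # maximum exploration width

    def select_action(self, state, rng):
        # Exploration narrows as the expected evaluation approaches 1,
        # so actions in well-learned states vary less.
        sigma = self.sigma_max * max(0.0, 1.0 - self.q[state])
        return rng.normal(self.mu[state], sigma), sigma

    def update(self, state, action, sigma, reward, next_state):
        # Tabular Q-learning update on the expected evaluation signal.
        target = reward + self.gamma * self.q[next_state]
        td_error = target - self.q[state]
        self.q[state] += self.alpha * td_error
        # Shift the action mean toward actions that beat the expectation,
        # differentiating actions within the same discrete state.
        if sigma > 0:
            self.mu[state] += self.alpha * td_error * (action - self.mu[state])

rng = np.random.default_rng(0)
agent = SRVQLearner(n_states=2)
action, sigma = agent.select_action(0, rng)
agent.update(0, action, sigma, reward=1.0, next_state=1)
```

In a cooperative setting such as the two-robot task, each agent would run its own instance of this learner over its individual state input.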