Learning intelligent behavior in a non-stationary and partially observable environment
Artificial Intelligence Review
Learning in a partially observable and non-stationary environment remains one of the challenging problems in the area of multiagent (MA) learning. Reinforcement learning is a generic method that suits the needs of MA learning in many respects. This paper presents two new multiagent-based, domain-independent coordination mechanisms for reinforcement learning; the agents do not require explicit communication among themselves to learn coordinated behavior. The first is the perceptual coordination mechanism, in which other agents are included in state descriptions and coordination information is learned from state transitions. The second is the observing coordination mechanism, which also includes other agents in state descriptions and, in addition, observes the rewards of nearby agents from the environment. The observed rewards and the agent's own reward are used together to construct an optimal policy, so the latter mechanism tends to increase region-wide joint rewards. The experimental domain is the adversarial food-collecting world (AFCW), which can be configured as either a single-agent or a multiagent environment. Function approximation and generalization techniques are used because of the huge state space. Experimental results show the effectiveness of these mechanisms.
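The abstract does not spell out the update rule behind the observing mechanism, so the following is a minimal tabular Q-learning sketch of the idea, assuming the agent blends its own reward with the mean of the rewards it observes from nearby agents. The mixing weight w, the epsilon-greedy policy, and the tabular value table are illustrative assumptions; the paper itself reports using function approximation and generalization over the large AFCW state space.

```python
import random
from collections import defaultdict

class ObservingQLearner:
    """Sketch of the 'observing' coordination mechanism: a tabular
    Q-learner whose backup mixes the agent's own reward with rewards
    observed from nearby agents. States are assumed to be hashable
    encodings that already include the other agents, as in the
    perceptual mechanism."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, w=0.5):
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.actions = list(actions)
        self.alpha, self.gamma = alpha, gamma
        self.epsilon, self.w = epsilon, w  # w: weight on observed rewards (assumed)

    def act(self, state):
        # Epsilon-greedy selection over the current value estimates.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, own_reward, observed_rewards, next_state):
        # Blend the agent's own reward with the mean reward observed from
        # nearby agents, nudging learning toward region-wide joint reward.
        if observed_rewards:
            mean_obs = sum(observed_rewards) / len(observed_rewards)
            blended = (1.0 - self.w) * own_reward + self.w * mean_obs
        else:
            blended = own_reward
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = blended + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

With w = 0 the update reduces to ordinary independent Q-learning, which corresponds to the perceptual mechanism's reliance on state transitions alone; raising w shifts learning toward region-wide joint reward, the effect the abstract attributes to the observing mechanism.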