In this article, we propose two methods for adapting parameters in multi-agent reinforcement learning (MARL) for repeated resource sharing problems (RRSP). Resource sharing problems (RSP) are an important and widely applicable class of MARL problems; RRSP is a variant of RSP in which agents select resources repeatedly and periodically. We previously proposed a learning method for MARL in RRSP called Moderated Global Information (MGI). However, several parameters in MGI must be tuned carefully for learning to converge to suitable states, in particular the temperature parameter T of the Boltzmann selection that governs agent behavior and the modification parameter L. To avoid this difficulty, we propose two methods that adjust these parameters according to each agent's performance and the statistical behavior of the agents. Results of several experiments show that the proposed methods are robust against changes in the environment and drive agent behavior toward the optimal state.
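The abstract does not detail MGI itself, but the Boltzmann selection it refers to is the standard softmax action-selection rule: the probability of choosing a resource grows with its learned value, and the temperature T trades off exploration (high T) against exploitation (low T). A minimal sketch in Python — the function name and Q-value representation are illustrative assumptions, not taken from the paper:

```python
import math
import random

def boltzmann_select(q_values, temperature):
    """Sample a resource index from the Boltzmann (softmax) distribution.

    High temperature -> near-uniform (exploratory) choices;
    low temperature -> near-greedy choices.
    """
    # Subtract the maximum Q-value before exponentiating for numerical stability.
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Sample an index according to the resulting probabilities.
    r = random.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1  # guard against floating-point rounding
```

The sensitivity the authors describe is visible here: as T approaches zero the exponentials make the selection effectively deterministic, while a large T washes out the Q-value differences, which is why adapting T online (as the proposed methods do) matters for convergence.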