Large-scale simulation studies are necessary to study the learning behaviour of individual agents and the overall system dynamics. One reason is that planning algorithms that find optimal solutions to fully observable, general decentralised Markov decision problems do not admit polynomial-time worst-case complexity bounds. Additionally, agent interaction often implies a non-stationary environment, which does not lend itself to asymptotically greedy policies; policies with a constant level of exploration are therefore required for continuous adaptation. This paper casts the application domain of distributed task assignment into the formalisms of queueing theory, complex networks and decentralised Markov decision problems in order to analyse the impact of three parameters: the momentum of a standard back-propagation neural network function approximator, the discount factor of $SARSA(0)$ reinforcement learning, and the $\epsilon$ parameter of the $\epsilon$-greedy policy. For this purpose, large queueing networks of one thousand interacting agents are evolved. A Kriging metamodel is fitted and, combined with simulated annealing, optimal operating conditions with respect to the total average response time are found. The insights gained from this study are significant in that they provide guidance for deploying large-scale distributed task assignment systems modelled as multi-agent queueing networks.
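To make the roles of the discount factor and the $\epsilon$ parameter concrete, the following is a minimal sketch of a $SARSA(0)$ update paired with a constant-$\epsilon$-greedy action selection. It uses a tabular Q-function (a plain dictionary) for brevity rather than the back-propagation neural network approximator studied in the paper; all names, the learning rate `alpha`, and the example values are illustrative assumptions, not details from the study.

```python
import random

def epsilon_greedy(q, state, actions, eps):
    """Constant-exploration policy: with probability eps pick a random
    action, otherwise pick the action with the highest Q-value."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: q.get((state, a), 0.0))

def sarsa0_update(q, s, a, r, s2, a2, alpha=0.1, gamma=0.9):
    """One SARSA(0) step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * Q(s',a') - Q(s,a)),
    where a' is the action actually selected in s' (on-policy)."""
    td_error = r + gamma * q.get((s2, a2), 0.0) - q.get((s, a), 0.0)
    q[(s, a)] = q.get((s, a), 0.0) + alpha * td_error
    return q
```

Because $\epsilon$ stays constant instead of being annealed to zero, the policy keeps exploring indefinitely, which is what allows continuous adaptation in a non-stationary multi-agent environment.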