Data networks
Technical Note: Q-Learning
Machine Learning
Minimum delay routing in stochastic networks
IEEE/ACM Transactions on Networking (TON)
Competitive routing in multiuser communication networks
IEEE/ACM Transactions on Networking (TON)
Making greed work in networks: a game-theoretic analysis of switch service disciplines
IEEE/ACM Transactions on Networking (TON)
Achieving network optima using Stackelberg routing strategies
IEEE/ACM Transactions on Networking (TON)
Congestion resulting from increased capacity in single-server queueing networks
IEEE/ACM Transactions on Networking (TON)
Online learning about other agents in a dynamic multiagent system
AGENTS '98 Proceedings of the second international conference on Autonomous agents
Anytime coalition structure generation with worst case guarantees
AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
The dynamics of reinforcement learning in cooperative multiagent systems
AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
Reinforcement learning for call admission control and routing in integrated service networks
NIPS '97 Proceedings of the 1997 conference on Advances in neural information processing systems 10
Adaptivity in agent-based routing for data networks
AGENTS '00 Proceedings of the fourth international conference on Autonomous agents
Using collective intelligence to route Internet traffic
Proceedings of the 1998 conference on Advances in neural information processing systems 11
Distributed Artificial Intelligence
Distributed Artificial Intelligence
Learning sequences of actions in collectives of autonomous agents
Proceedings of the first international joint conference on Autonomous agents and multiagent systems: part 1
Introduction to Reinforcement Learning
Introduction to Reinforcement Learning
A Roadmap of Agent Research and Development
Autonomous Agents and Multi-Agent Systems
Learning to Predict by the Methods of Temporal Differences
Machine Learning
Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
TPOT-RL Applied to Network Routing
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Collective Intelligence and Braess' Paradox
Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence
Ants and reinforcement learning: a case study in routing in dynamic networks
IJCAI '97 Proceedings of the Fifteenth international joint conference on Artificial intelligence - Volume 2
Reinforcement learning: a survey
Journal of Artificial Intelligence Research
Adaptive load balancing: a study in multi-agent learning
Journal of Artificial Intelligence Research
Social dilemmas in computational ecosystems
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 1
Reinforcement learning in distributed domains: beyond team games
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
Architecting noncooperative networks
IEEE Journal on Selected Areas in Communications
Product Distribution Theory for Control of Multi-Agent Systems
AAMAS '04 Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 2
Vector Valued Markov Decision Process for robot platooning
Proceedings of the 2008 conference on ECAI 2008: 18th European Conference on Artificial Intelligence
DReL: a middleware for wireless sensor networks management using reinforcement learning techniques
Proceedings of the 5th International Workshop on Middleware Tools, Services and Run-Time Support for Sensor Networks
Cognitive policy learner: biasing winning or losing strategies
The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
Proceedings of the 15th annual conference companion on Genetic and evolutionary computation
Addressing hard constraints in the air traffic problem through partitioning and difference rewards
Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
We consider the problem of designing the utility functions of utility-maximizing agents in a multi-agent system (MAS) so that they work synergistically to maximize a global utility. The particular problem domain we explore is the control of network routing, with agents placed on all the routers in the network. Conventional approaches to this task have every agent use the Ideal Shortest Path routing Algorithm (ISPA). We demonstrate that in many cases, due to the side-effects of one agent's actions on another agent's performance, having agents use ISPAs is suboptimal as far as global aggregate cost is concerned, even when they route only infinitesimally small amounts of traffic. Intuitively speaking, the utility functions of the individual agents are not "aligned" with the global utility. As a particular example, we present an instance of Braess' paradox, in which adding new links to a network whose agents all use the ISPA results in a decrease in overall throughput. We also demonstrate that load balancing, in which the agents' decisions are collectively made to optimize the global cost incurred by all traffic currently being routed, is suboptimal as far as global cost averaged across time is concerned. This too is due to side-effects, in this case of current routing decisions on future traffic. The mathematics of Collective Intelligence (COIN) is concerned precisely with avoiding such deleterious side-effects in multi-agent systems, both over time and across space. We present key concepts from that mathematics and use them to derive an algorithm whose ideal version should have better performance than having all agents use the ISPA, even in the infinitesimal limit.
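The Braess' paradox mentioned above can be illustrated with a minimal sketch. This uses the textbook four-node instance (source S, sink T, intermediate nodes A and B), not necessarily the paper's exact topology: two "variable" links whose latency equals their load fraction, two "fixed" links of latency 1, and unit total demand. The flow values below are the standard selfish (Wardrop) equilibria for this instance, stated as assumptions rather than computed.

```python
# Hedged sketch of the classic Braess network, an illustrative assumption
# rather than the paper's own example. Latencies: l(x) = x on the two
# variable links, l(x) = 1 on the two fixed links; total demand is 1.

def avg_latency_without_shortcut():
    # Equilibrium of the symmetric two-path network: traffic splits 50/50,
    # so each path costs (variable link at load 0.5) + (fixed link) = 1.5.
    f = 0.5
    return f + 1.0

def avg_latency_with_shortcut():
    # Adding a zero-latency shortcut A->B makes S->A->B->T dominant for
    # every self-interested agent, so both variable links carry full load.
    f = 1.0
    return f + 0.0 + f  # 2.0 > 1.5: the new link hurts everyone

print(avg_latency_without_shortcut())  # 1.5
print(avg_latency_with_shortcut())     # 2.0
```

Every individual agent acts optimally given the others, yet the added capacity raises the average latency from 1.5 to 2.0, which is exactly the misalignment between private and global utility that the abstract describes.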
We present experiments verifying this, and also showing that a machine-learning-based version of this COIN algorithm, in which costs are only imprecisely estimated via empirical means (a version potentially applicable in the real world), also outperforms the ISPA, despite having access to less information than the ISPA does. In particular, this COIN algorithm almost always avoids Braess' paradox.
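A central COIN concept behind such algorithms is the difference reward, often called the Wonderful Life Utility (WLU): an agent is scored by the global cost minus the global cost with that agent's action "clamped" to a null action. The network model and the clamping convention below are illustrative assumptions, not the paper's exact construction:

```python
# Hedged sketch of a WLU-style difference reward over link loads.
# Each agent contributes one unit of traffic to a single link; the
# quadratic per-link cost is an assumed stand-in for congestion.

def global_cost(loads):
    # Convex congestion cost: grows quadratically with each link's load.
    return sum(x * x for x in loads)

def wlu(agent_link, loads):
    # Difference reward G(z) - G(z with this agent clamped to "absent"):
    # the agent's marginal contribution to global cost.
    clamped = list(loads)
    clamped[agent_link] -= 1
    return global_cost(loads) - global_cost(clamped)

# With loads [3, 1], an agent on link 0 has WLU 9 - 4 = 5; moving to the
# lighter link would lower both its own WLU and the global cost.
print(wlu(0, [3, 1]))  # 5
```

Because a change in one agent's action moves its WLU and the global cost by the same amount, agents that learn to minimize their WLU are "aligned" with the global objective, avoiding the side-effects that make selfish shortest-path routing suboptimal.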