Optimal agent cooperation with local information

  • Authors: Jan De Mot; Eric Feron
  • Affiliations: Massachusetts Institute of Technology (both authors)
  • Venue: Optimal agent cooperation with local information
  • Year: 2005

Abstract

Multi-agent systems are generally believed to be more efficient, robust, and versatile than their single-agent equivalents. However, it is not easy to design strategies that fully exploit the multi-agent benefits, and with this in mind we address several multi-agent system design issues. Specifically, it is of central importance to determine the optimal agent group composition, which involves a trade-off between the cost and the performance increase per additional agent. Further, truly autonomous agents rely solely on on-board environment measurements, the design of which requires quantifying the multi-agent performance as a function of the locally observed environment areas. In this thesis, we focus on the collaborative search for individually rewarding resources, i.e., multiple agents may collect the same reward. The system objective is to maximize the aggregate reward collected. Motivated by a cooperative surveillance context, we formulate a graph traversal problem on an unbounded structured graph, and constrain the agent motion spatially so that only the lateral agent separation is controlled. We model the problem mathematically as a discrete, infinite-state, infinite-horizon dynamic program and convert it using standard techniques to an equivalent linear program (LP) with infinitely many constraints. The spatial invariance of the graph allows us to decompose the LP into a set of infinitely many coupled LPs, each with finitely many constraints. We establish that the unique bounded function that simultaneously satisfies these LPs is the optimal value function of the problem. Based on this, we compute the two-agent optimal value function explicitly as the solution of an LP with finitely many constraints for small agent separations, and implicitly in the form of a recursion for large agent separations, subject to suitable connection constraints.
Finally, we propose a similar method to compute the steady-state probability distribution of the state under an optimal policy, summarizing the agent behavior at large separations in a set of connection constraints, which is sufficient to compute the probability distribution at small separations. We analyze and compare the optimal performance of various problem instances. We confirm and quantify the intuition that performance increases with the group size. Some results stand out: for cone-shaped local observation, two agents incur 25% less cost than a single agent in a minefield-type environment (scarce but high costs); further, for some environment settings, a third agent provides little to no performance increase. Then, we compare various local observation zones and quantify their effect on the overall group performance. Finally, we study the agent spatial distribution under an optimal policy, and observe that as rewards become scarcer, the agents tend to spread out in order to gather information on a larger part of the environment. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
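
The standard link between a dynamic program and a linear program mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the thesis model: it assumes a small finite, discounted, reward-maximizing problem in which the state is a hypothetical agent separation truncated to 0..3, the actions change the separation by -1, 0, or +1, and the rewards and discount factor are invented for illustration. It computes the value function by value iteration and then checks that this function satisfies the LP constraints V(s) >= r(s, a) + gamma * V(s') for every state-action pair, while a uniformly smaller function does not.

```python
# Toy illustration only: a small finite, discounted problem (NOT the thesis model).
# States are hypothetical truncated agent separations 0..3; actions shift the
# separation by -1, 0, or +1. Rewards and the discount factor are made up.

GAMMA = 0.9
REWARD = [0.0, 1.0, 2.0, 1.5]  # hypothetical reward for arriving at each separation
STATES = range(len(REWARD))
ACTIONS = (-1, 0, 1)


def step(s, a):
    """Deterministic transition: clamp the new separation to the range 0..3."""
    return min(max(s + a, 0), len(REWARD) - 1)


def q_value(v, s, a):
    """One-step lookahead: reward at the next separation plus discounted value."""
    nxt = step(s, a)
    return REWARD[nxt] + GAMMA * v[nxt]


def value_iteration(tol=1e-10):
    """Apply the Bellman operator repeatedly until the values stop changing."""
    v = [0.0] * len(REWARD)
    while True:
        new_v = [max(q_value(v, s, a) for a in ACTIONS) for s in STATES]
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v


def lp_feasible(v, eps=1e-6):
    """Check the LP constraints V(s) >= r + gamma * V(s') for every (s, a)."""
    return all(v[s] >= q_value(v, s, a) - eps for s in STATES for a in ACTIONS)


v_star = value_iteration()
print(v_star)
print(lp_feasible(v_star))                     # the DP solution is LP-feasible
print(lp_feasible([x - 1.0 for x in v_star]))  # a uniformly smaller function is not
```

In this toy instance the value function converges to approximately [19, 20, 20, 20]: the best policy moves to separation 2 and stays there. Among all functions that satisfy the constraints, the optimal value function is the componentwise smallest, which is why minimizing a positively weighted sum of V(s) subject to these constraints recovers it in the finite discounted case. The thesis deals with an infinite state space, where this correspondence requires the additional decomposition and boundedness arguments described in the abstract.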