Bayesian interaction shaping: learning to influence strategic interactions in mixed robotic domains

Authors:
Aris Valtazanos;Subramanian Ramamoorthy
Affiliations:
University of Edinburgh, Edinburgh, United Kingdom;University of Edinburgh, Edinburgh, United Kingdom
Venue:
Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
Year:
2013


Abstract

Despite recent advances in getting autonomous robots to follow instructions from humans, strategically intelligent robot behaviours have received less attention. Strategic intelligence entails influence over the beliefs of other interacting agents, possibly adversarial. In this paper, we present a learning framework for strategic interaction shaping in physical robotic systems, where an autonomous robot must lead an unknown adversary to a desired joint state. Offline, we learn composable interaction templates, represented as shaping regions and tactics, from human demonstrations. Online, the agent empirically learns the adversary's responses to executed tactics, and the reachability of different regions. Interaction shaping is effected by selecting tactic sequences through Bayesian inference over the expected reachability of their traversed regions. We experimentally evaluate our approach in an adversarial soccer penalty task between NAO robots, by comparing an autonomous shaping robot with and against human-controlled agents. Results, based on 650 trials and a diverse group of 30 human subjects, demonstrate that the shaping robot performs comparably to the best human-controlled robots, in interactions with a heuristic autonomous adversary. The shaping robot is also shown to progressively improve its influence over a more challenging strategic adversary controlled by an expert human user.
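
The online selection step described in the abstract (choosing tactic sequences by Bayesian inference over the expected reachability of the regions they traverse) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes, purely for illustration, that each shaping region carries an independent Beta-Bernoulli belief about whether the adversary is successfully led into it, and that a sequence is scored by the product of the posterior mean reachabilities of its regions. The names `RegionReachability`, `sequence_score`, `select_sequence`, and the region labels "A", "B", "C" are all hypothetical.

```python
class RegionReachability:
    """Hypothetical Beta-Bernoulli belief that a shaping region can be reached."""

    def __init__(self, alpha=1.0, beta=1.0):
        # Beta(1, 1) is a uniform prior: no initial knowledge of the adversary.
        self.alpha = alpha
        self.beta = beta

    def update(self, reached):
        """Update the belief after one executed tactic (reached is True/False)."""
        if reached:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    def mean(self):
        """Posterior mean probability that the region is reachable."""
        return self.alpha / (self.alpha + self.beta)


def sequence_score(sequence, beliefs):
    """Expected reachability of a sequence, assuming independent regions."""
    score = 1.0
    for region in sequence:
        score *= beliefs[region].mean()
    return score


def select_sequence(candidates, beliefs):
    """Pick the candidate sequence with the highest expected reachability."""
    return max(candidates, key=lambda seq: sequence_score(seq, beliefs))


# Illustrative observations (made up for this sketch, not data from the paper).
beliefs = {name: RegionReachability() for name in ("A", "B", "C")}
for outcome in (True, True, True, False):
    beliefs["A"].update(outcome)
for outcome in (False, False, True):
    beliefs["B"].update(outcome)

candidates = [("A", "C"), ("B", "C"), ("A", "B")]
best = select_sequence(candidates, beliefs)
print(best)
```

With these made-up observations, region "A" has the highest posterior mean, so sequences passing through it are preferred. The independence assumption and the product score are simplifications chosen to keep the example short. The paper's actual inference procedure may differ.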