Despite recent advances in getting autonomous robots to follow instructions from humans, strategically intelligent robot behaviours have received less attention. Strategic intelligence entails influencing the beliefs of other interacting agents, who may be adversarial. In this paper, we present a learning framework for strategic interaction shaping in physical robotic systems, where an autonomous robot must lead an unknown adversary to a desired joint state. Offline, we learn composable interaction templates, represented as shaping regions and tactics, from human demonstrations. Online, the agent empirically learns the adversary's responses to executed tactics and the reachability of different regions. Interaction shaping is effected by selecting tactic sequences through Bayesian inference over the expected reachability of the regions they traverse. We experimentally evaluate our approach in an adversarial soccer penalty task between NAO robots, comparing an autonomous shaping robot with and against human-controlled agents. Results, based on 650 trials and a diverse group of 30 human subjects, demonstrate that the shaping robot performs comparably to the best human-controlled robots in interactions with a heuristic autonomous adversary. The shaping robot is also shown to progressively improve its influence over a more challenging strategic adversary controlled by an expert human user.
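
The abstract gives no implementation details for the online selection step, so the following is only an illustrative sketch of one way to choose tactic sequences from empirically learned reachability, not the authors' method. It assumes a Beta-Bernoulli belief per shaping region, a uniform prior, and a sequence score equal to the product of the expected reachabilities of the regions traversed (which treats regions as independent). The names `RegionBelief`, `sequence_score`, `select_sequence`, and the region labels are all hypothetical.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class RegionBelief:
    """Beta posterior over the probability that a region can be reached.

    The uniform Beta(1, 1) prior is an assumption made for this sketch.
    """
    successes: float = 1.0  # Beta alpha parameter
    failures: float = 1.0   # Beta beta parameter

    def update(self, reached: bool) -> None:
        """Bayesian update after observing one attempt to reach the region."""
        if reached:
            self.successes += 1.0
        else:
            self.failures += 1.0

    @property
    def expected_reachability(self) -> float:
        """Posterior mean of the Beta distribution."""
        return self.successes / (self.successes + self.failures)


def sequence_score(sequence, beliefs):
    """Score a tactic sequence by the regions it traverses.

    Assumes independence between regions, so the score is a simple product.
    """
    return prod(beliefs[region].expected_reachability for region in sequence)


def select_sequence(candidates, beliefs):
    """Pick the candidate sequence with the highest expected reachability."""
    return max(candidates, key=lambda seq: sequence_score(seq, beliefs))


# Hypothetical regions and observations, used only to exercise the sketch.
beliefs = {name: RegionBelief() for name in ("left", "centre", "right")}
for region, reached in [("left", True), ("left", True), ("right", False)]:
    beliefs[region].update(reached)

# Each candidate is represented by the tuple of regions it would traverse.
candidates = [("left", "centre"), ("right", "centre"), ("left", "right")]
best = select_sequence(candidates, beliefs)
```

In this toy run the posterior means are 0.75 for `left`, 0.5 for `centre`, and 1/3 for `right`, so the sequence traversing `left` then `centre` is selected. A greedy maximum is the simplest choice; a real system would likely need to balance exploring poorly observed regions against exploiting well-understood ones.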