One drawback of using plan recognition in adversarial games is that players must often commit to a plan before the opponent's intentions can be inferred. In such cases, it is valuable to couple plan recognition with plan repair, particularly in multi-agent domains where complete replanning is not computationally feasible. This paper presents a method for learning plan repair policies in real-time using Upper Confidence Bounds applied to Trees (UCT). We demonstrate how these policies can be coupled with plan recognition in an American football game (Rush 2008) to create an autonomous offensive team capable of responding to unexpected changes in defensive strategy. Our real-time version of UCT learns play modifications that yield significantly higher average yardage and fewer interceptions than either the baseline game or domain-specific heuristics. Although the actual game simulator can be used to measure reward offline, executing UCT in real-time demands a different approach; here we describe two modules that reuse data from offline UCT searches to learn accurate state and reward estimators.
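At the core of UCT is the UCB1 action-selection rule, which balances exploiting actions with high average reward against exploring rarely tried ones. The sketch below is a minimal, illustrative one-level UCT (i.e., a UCB1 bandit over a small set of candidate actions with stochastic rewards); the function names and the toy Bernoulli reward model are this sketch's assumptions, not taken from the paper:

```python
import math
import random

def ucb1(total_value, child_visits, parent_visits, c=1.4):
    """UCB1 score: mean reward plus an exploration bonus that
    shrinks as the child accumulates visits."""
    if child_visits == 0:
        return float("inf")  # unvisited actions are tried first
    return total_value / child_visits + c * math.sqrt(
        math.log(parent_visits) / child_visits
    )

def uct_bandit(arm_probs, iterations=2000, seed=0):
    """Repeatedly pick the arm with the highest UCB1 score, sample
    its stochastic (Bernoulli) reward, and update its statistics.
    Returns the index of the most-visited arm."""
    rng = random.Random(seed)
    visits = [0] * len(arm_probs)
    values = [0.0] * len(arm_probs)
    for t in range(1, iterations + 1):
        scores = [ucb1(values[i], visits[i], t) for i in range(len(arm_probs))]
        arm = scores.index(max(scores))
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        visits[arm] += 1
        values[arm] += reward
    return visits.index(max(visits))
```

In full UCT this rule is applied recursively down a search tree, with rollouts estimating leaf values; the paper's real-time variant replaces the simulator-based rollouts with learned state and reward estimators.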