In this paper, we describe an architectural modification to Soar that gives a Soar agent the opportunity to learn statistical information about the past success of its actions and utilize this information when selecting an operator. This mechanism serves the same purpose as production utilities in ACT-R, but the implementation is more directly tied to the standard definition of the reinforcement learning (RL) problem. The paper explains our implementation, gives a rationale for adding an RL capability to Soar, and shows results for Soar-RL agents' performance on two tasks.
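The mechanism described above — learning statistics about the past success of actions and using them to bias operator selection — can be illustrated with a minimal sketch. The class name, the tabular state/operator representation, and the epsilon-greedy selection rule below are assumptions for illustration only, not Soar-RL's actual implementation; they simply instantiate the standard one-step Q-learning formulation of the RL problem the abstract refers to.

```python
import random

class QOperatorSelector:
    """Illustrative sketch: tabular Q-learning over (state, operator) pairs."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = {}  # (state, operator) -> learned value estimate
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability

    def value(self, state, op):
        return self.q.get((state, op), 0.0)

    def select(self, state, operators):
        # Epsilon-greedy: usually prefer the operator with the highest
        # learned value, occasionally explore at random.
        if random.random() < self.epsilon:
            return random.choice(operators)
        return max(operators, key=lambda op: self.value(state, op))

    def update(self, state, op, reward, next_state, next_operators):
        # Standard one-step Q-learning backup:
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        best_next = max(
            (self.value(next_state, o) for o in next_operators), default=0.0
        )
        td_target = reward + self.gamma * best_next
        self.q[(state, op)] = self.value(state, op) + self.alpha * (
            td_target - self.value(state, op)
        )
```

In this sketch the learned values play the role the abstract assigns to "statistical information about the past success of its actions": they are updated from reward after each step and consulted whenever an operator must be chosen.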