Reinforcement learning as heuristic for action-rule preferences

  • Authors:
  • Joost Broekens, Koen Hindriks, Pascal Wiggers

  • Affiliations:
  • Man-Machine Interaction department (MMI), Delft University of Technology, The Netherlands (all authors)

  • Venue:
  • ProMAS'10: Proceedings of the 8th International Conference on Programming Multi-Agent Systems
  • Year:
  • 2010


Abstract

A common action selection mechanism in agent-oriented programming is to base action selection on a set of rules. Since these rules need not be mutually exclusive, agents are often underspecified: their decision-making leaves room for multiple choices of action. Underspecification implies there is potential for improving or optimizing the agent's behavior. Such optimization, however, is not always naturally coded using BDI-like agent concepts. In this paper, we propose an approach that exploits this potential for improvement using reinforcement learning. The approach learns rule priorities to solve the rule-selection problem, and we show that it significantly improves the behavior of an agent. Key here is a state representation that combines the agent's set of rules with a domain-independent heuristic based on the number of active goals. Our experiments show that this provides a useful, generic basis for learning while avoiding both the state-explosion problem and overfitting.
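
To make the idea concrete, the following is a minimal, hypothetical sketch (not the authors' implementation) of Q-learning over the state abstraction the abstract describes: the set of currently applicable rules combined with the number of active goals. All names (RulePriorityLearner, make_state, and so on) are illustrative, and the reward signal is assumed to come from the agent's environment, e.g. from achieving goals.

    import random
    from collections import defaultdict

    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration

    def make_state(applicable_rules, active_goals):
        # State abstraction from the abstract: which rules are currently
        # applicable, plus a domain-independent heuristic -- the number
        # of goals the agent has not yet achieved.
        return (frozenset(applicable_rules), len(active_goals))

    class RulePriorityLearner:
        def __init__(self):
            # Q[state][rule]: learned priority of choosing `rule` when
            # several rules are applicable at once.
            self.q = defaultdict(lambda: defaultdict(float))

        def select_rule(self, state):
            # Epsilon-greedy choice restricted to the applicable rules.
            applicable = list(state[0])
            if random.random() < EPSILON:
                return random.choice(applicable)
            return max(applicable, key=lambda r: self.q[state][r])

        def update(self, state, rule, reward, next_state):
            # Standard Q-learning backup over the next state's
            # applicable rules only.
            best_next = max((self.q[next_state][r] for r in next_state[0]),
                            default=0.0)
            target = reward + GAMMA * best_next
            self.q[state][rule] += ALPHA * (target - self.q[state][rule])

    # Hypothetical usage inside an agent's deliberation cycle:
    #   state = make_state(agent.applicable_rules(), agent.active_goals())
    #   rule = learner.select_rule(state)
    #   reward, next_state = agent.apply(rule)  # assumed environment feedback
    #   learner.update(state, rule, reward, next_state)

Because the state records only which rules are applicable and how many goals remain, the Q-table stays small regardless of the size of the underlying world state, which is what lets a representation of this kind avoid the state-explosion problem mentioned above.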