Imitation and Reinforcement Learning in Agents with Heterogeneous Actions

Authors:
Bob Price;Craig Boutilier
Affiliations:
-;-
Venue:
AI '01 Proceedings of the 14th Biennial Conference of the Canadian Society on Computational Studies of Intelligence: Advances in Artificial Intelligence
Year:
2001

Abstract

Reinforcement learning techniques are increasingly being used to solve difficult problems in control and combinatorial optimization with promising results. Implicit imitation can accelerate reinforcement learning (RL) by augmenting the Bellman equations with information from the observation of expert agents (mentors). We propose two extensions that permit imitation of agents with heterogeneous actions: feasibility testing, which detects infeasible mentor actions, and k-step repair, which searches for plans that approximate infeasible actions. We demonstrate empirically that both of these extensions allow imitation agents to converge more quickly in the presence of heterogeneous actions.
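
The ideas named in the abstract can be illustrated with a minimal sketch, assuming a small tabular MDP with known (estimated) transition models. Everything here is hypothetical and for illustration only: the function names, the array layout, the total-variation threshold used as a stand-in for a proper feasibility test, and the reachability check used as a simplified stand-in for k-step repair. It is not the authors' implementation. The backup takes the better of the observer's own best action and the mentor's observed transition, but uses the mentor term only in states where the mentor's action is judged feasible.

```python
import numpy as np

def augmented_backup(V, R, P_obs, P_mentor, feasible, gamma=0.95):
    """One synchronous Bellman backup augmented with a mentor term.

    V        : (S,) current value estimates
    R        : (S,) state rewards
    P_obs    : (A, S, S) observer model, P_obs[a, s, t] = Pr(t | s, a)
    P_mentor : (S, S) model of the mentor's observed transitions, Pr_m(t | s)
    feasible : (S,) bool, True where the mentor's action is believed imitable
    """
    own = (P_obs @ V).max(axis=0)   # best expected value over the observer's actions
    mentor = P_mentor @ V           # expected value of following the mentor
    best = np.where(feasible, np.maximum(own, mentor), own)
    return R + gamma * best

def feasibility_mask(P_obs, P_mentor, tol=0.1):
    """Assumed stand-in for feasibility testing: the mentor's action in state s
    counts as feasible if some observer action has a similar outcome
    distribution (total variation distance at most tol)."""
    tv = 0.5 * np.abs(P_obs - P_mentor[None, :, :]).sum(axis=2)  # shape (A, S)
    return tv.min(axis=0) <= tol

def reachable_within_k(P_obs, start, goal, k, eps=1e-9):
    """Assumed simplification of k-step repair: can the observer reach `goal`
    from `start` in at most k steps with nonzero probability?"""
    reached = {start}
    for _ in range(k):
        if goal in reached:
            return True
        successors = set()
        for s in reached:
            # states reachable from s under at least one observer action
            successors.update(np.flatnonzero(P_obs[:, s, :].max(axis=0) > eps).tolist())
        reached |= successors
    return goal in reached

# Toy 3-state chain: the observer can only step right; the mentor jumps 0 -> 2.
P_obs = np.array([[[0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0],
                   [0.0, 0.0, 1.0]]])
P_mentor = np.array([[0.0, 0.0, 1.0],
                     [0.0, 0.0, 1.0],
                     [0.0, 0.0, 1.0]])
R = np.array([0.0, 0.0, 1.0])
V = np.array([0.0, 0.0, 1.0])

mask = feasibility_mask(P_obs, P_mentor)
V_next = augmented_backup(V, R, P_obs, P_mentor, mask, gamma=0.9)
```

In this toy chain the mentor's jump from state 0 has no counterpart among the observer's actions, so the mask switches the mentor term off in that state and the backup there falls back to the observer's own model. The reachability check shows the intuition behind repair: the mentor's successor state is out of reach in one step but reachable in two, so a short plan could approximate the infeasible action.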