This paper steps back from the standard infinite-horizon formulation of reinforcement learning problems to consider the simpler case of finite-horizon problems. Although finite-horizon problems can be solved with infinite-horizon learning algorithms by recasting them as infinite-horizon problems over a state space extended to include time, we show that such an application of infinite-horizon algorithms makes no use of what is known about the environment's structure and is therefore inefficient. Preserving an explicit notion of time within the environment allows us to extend the environment model to include, for example, random action durations. Such extensions allow us to model non-Markov environments that can still be learned with reinforcement learning algorithms.
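As a rough illustration of the idea of keeping time explicit rather than folding it into the state, the sketch below implements tabular Q-learning with one Q-table per timestep, so that `Q[t][s][a]` estimates the return achievable from state `s` with `horizon - t` steps remaining. The environment interface (`step(s, a, rng)`), the toy chain environment, and all parameter values are assumptions for this sketch, not taken from the paper.

```python
import random

def finite_horizon_q_learning(n_states, n_actions, horizon, step,
                              episodes=5000, alpha=0.1, seed=0):
    """Tabular Q-learning with a separate Q-table per timestep.

    Instead of recasting the problem over an extended state (s, t) for an
    infinite-horizon learner, time stays explicit: Q[t][s][a] estimates the
    return from taking action a in state s at step t. `step(s, a, rng)` is
    a hypothetical environment function returning (next_state, reward).
    """
    rng = random.Random(seed)
    Q = [[[0.0] * n_actions for _ in range(n_states)] for _ in range(horizon)]
    for _ in range(episodes):
        s = 0  # every episode starts in state 0
        for t in range(horizon):
            a = rng.randrange(n_actions)  # uniform exploration
            s2, r = step(s, a, rng)
            # Backup target: the value after the final step is zero,
            # since the episode ends at the horizon.
            target = r + (max(Q[t + 1][s2]) if t + 1 < horizon else 0.0)
            Q[t][s][a] += alpha * (target - Q[t][s][a])
            s = s2
    return Q

# Toy chain environment (an assumption, not from the paper): action 1 moves
# right for no reward, and pays 1 when taken at the right end; action 0
# stays put with reward 0.
def chain_step(s, a, rng, n_states=4):
    if a == 1 and s < n_states - 1:
        return s + 1, 0.0
    if a == 1 and s == n_states - 1:
        return s, 1.0
    return s, 0.0
```

Because the Q-table is indexed by timestep, the learned policy is naturally nonstationary: the greedy action in a given state can differ depending on how many steps remain, which is exactly the structure a time-blind infinite-horizon learner would have to rediscover through the extended state space.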