General time consistent discounting

Authors:
Tor Lattimore; Marcus Hutter
Venue:
Theoretical Computer Science
Year:
2014

Abstract

Modelling inter-temporal choice is a key problem in both computer science and economic theory. Samuelson's discounted utility model is currently the most popular model for measuring the global utility of a time-series of local utilities. The model is limited in that it does not allow the discount function to change with the age of the agent, even though many agents, in particular humans, are best modelled with age-dependent discount functions. It is well known that discounting can lead to time-inconsistent behaviour, where agents change their preferences over time. In this paper we generalise the discounted utility model to allow age-dependent discount functions. We then extend previous work on time-inconsistency to our new setting, including a complete characterisation of time-(in)consistent discount functions, the existence of sub-game perfect equilibrium policies where the discount function is time-inconsistent, and a continuity result showing that "nearly" time-consistent discount rates lead to "nearly" time-consistent behaviour.
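The time-inconsistency mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's formalism: it assumes a discount that depends only on the delay between the decision time and the reward, and it compares a geometric discount with a hyperbolic one. All names (`geometric`, `hyperbolic`, `preferred`, `choices_over_time`) and the numeric values are hypothetical and chosen for illustration only.

```python
# Illustrative sketch (assumed setup, not taken from the paper):
# an agent repeatedly re-evaluates two future rewards as time passes.

def geometric(gamma):
    """Discount gamma**delay; the ratio between two delays never changes."""
    return lambda delay: gamma ** delay

def hyperbolic(kappa):
    """Discount 1 / (1 + kappa * delay); a common time-inconsistent choice."""
    return lambda delay: 1.0 / (1.0 + kappa * delay)

def preferred(discount, now, options):
    """Index of the option with the highest discounted value as seen at time `now`."""
    values = [reward * discount(t - now) for (t, reward) in options]
    return max(range(len(options)), key=lambda i: values[i])

def choices_over_time(discount, options, horizon):
    """Preferred option at each decision time 0..horizon."""
    return [preferred(discount, now, options) for now in range(horizon + 1)]

# Option 0: reward 1.0 at time 10. Option 1: reward 1.6 at time 11.
options = [(10, 1.0), (11, 1.6)]

geo = choices_over_time(geometric(0.9), options, horizon=10)
hyp = choices_over_time(hyperbolic(1.0), options, horizon=10)

print("geometric :", geo)   # the preference is the same at every time step
print("hyperbolic:", hyp)   # the preference flips at the last time step
```

Under these assumed values the geometric agent prefers the later, larger reward at every decision time, while the hyperbolic agent prefers it from a distance and switches to the earlier, smaller reward once that reward is immediately available. This preference reversal is the kind of behaviour that "time-inconsistent" refers to.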