Reinforcement learning algorithms for average-payoff Markovian decision processes

Authors:
Satinder P. Singh
Venue:
AAAI '94 Proceedings of the twelfth national conference on Artificial intelligence (vol. 1)
Year:
1994

Citing 0
Cited 15

Learning curve bounds for a Markov decision process with undiscounted rewards
COLT '96 Proceedings of the ninth annual conference on Computational learning theory

On Average Versus Discounted Reward Temporal-Difference Learning
Machine Learning

From Perturbation Analysis to Markov Decision Processes and Reinforcement Learning
Discrete Event Dynamic Systems

Reinforcement Learning for Control of Traffic and Access Points in Intelligent Wireless ATM Networks
Proceedings of the International Conference, 7th Fuzzy Days on Computational Intelligence, Theory and Applications

Reinforcement Learning: A Tutorial Survey and Recent Advances
INFORMS Journal on Computing

Truncating temporal differences: on the efficient implementation of TD(λ) for reinforcement learning
Journal of Artificial Intelligence Research

Process-oriented planning and average-reward optimality
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2

Learning and adaptation of a policy for dynamic order acceptance in make-to-order manufacturing
Computers and Industrial Engineering

A minimum relative entropy principle for learning and acting
Journal of Artificial Intelligence Research

An average-reward reinforcement learning algorithm for computing bias-optimal policies
AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1

Auto-exploratory average reward reinforcement learning
AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1

Comparing a class of dynamic model-based reinforcement learning schemes for handoff prioritization in mobile communication networks
Expert Systems with Applications: An International Journal

Compound reinforcement learning: theory and an application to finance
EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning

Learning classifier system with average reward reinforcement learning
Knowledge-Based Systems

Intelligent controllers for bi-objective dynamic scheduling on a single machine with sequence-dependent setups
Applied Soft Computing

Abstract