Learning curve bounds for a Markov decision process with undiscounted rewards
COLT '96 Proceedings of the ninth annual conference on Computational learning theory
On Average Versus Discounted Reward Temporal-Difference Learning
Machine Learning
From Perturbation Analysis to Markov Decision Processes and Reinforcement Learning
Discrete Event Dynamic Systems
Reinforcement Learning for Control of Traffic and Access Points in Intelligent Wireless ATM Networks
Proceedings of the International Conference, 7th Fuzzy Days on Computational Intelligence, Theory and Applications
Reinforcement Learning: A Tutorial Survey and Recent Advances
INFORMS Journal on Computing
Journal of Artificial Intelligence Research
Process-oriented planning and average-reward optimality
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
Learning and adaptation of a policy for dynamic order acceptance in make-to-order manufacturing
Computers and Industrial Engineering
A minimum relative entropy principle for learning and acting
Journal of Artificial Intelligence Research
An average-reward reinforcement learning algorithm for computing bias-optimal policies
AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1
Auto-exploratory average reward reinforcement learning
AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1
Expert Systems with Applications: An International Journal
Compound reinforcement learning: theory and an application to finance
EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Learning classifier system with average reward reinforcement learning
Knowledge-Based Systems