Average reward reinforcement learning: foundations, algorithms, and empirical results

Authors:
Sridhar Mahadevan
Venue:
Machine Learning - Special issue on reinforcement learning
Year:
1996

Citing 0
Cited 52

On Average Versus Discounted Reward Temporal-Difference Learning

Machine Learning
Continuous-Action Q-Learning

Machine Learning
Metalearning and neuromodulation

Neural Networks - Computational models of neuromodulation
Opponent interactions between serotonin and dopamine

Neural Networks - Computational models of neuromodulation
Recent Advances in Hierarchical Reinforcement Learning

Discrete Event Dynamic Systems
Designing guide-path networks for automated guided vehicle system by using the Q-learning technique

Computers and Industrial Engineering
Long-term reward prediction in TD models of the dopamine system

Neural Computation
Variance-Penalized Reinforcement Learning for Risk-Averse Asset Allocation

IDEAL '00 Proceedings of the Second International Conference on Intelligent Data Engineering and Automated Learning, Data Mining, Financial Engineering, and Intelligent Agents
Open Theoretical Questions in Reinforcement Learning

EuroCOLT '99 Proceedings of the 4th European Conference on Computational Learning Theory
LC-Learning: Phased Method for Average Reward Reinforcement Learning - Preliminary Results

PRICAI '02 Proceedings of the 7th Pacific Rim International Conference on Artificial Intelligence: Trends in Artificial Intelligence
LC-Learning: Phased Method for Average Reward Reinforcement Learning - Analysis of Optimal Criteria

PRICAI '02 Proceedings of the 7th Pacific Rim International Conference on Artificial Intelligence: Trends in Artificial Intelligence
Multi-agent learning in extensive games with complete information

AAMAS '03 Proceedings of the second international joint conference on Autonomous agents and multiagent systems
Recent Advances in Hierarchical Reinforcement Learning

Discrete Event Dynamic Systems
To buy or not to buy: mining airfare data to minimize ticket purchase price

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Distributed Reinforcement Learning Control for Batch Sequencing and Sizing in Just-In-Time Manufacturing Systems

Applied Intelligence
A Reinforcement Learning Algorithm Based on Policy Iteration for Average Reward: Empirical Results with Yield Management and Convergence Analysis

Machine Learning
A Geometric Approach to Multi-Criterion Reinforcement Learning

The Journal of Machine Learning Research
General methodology 1: optimising discrete event simulation models using a reinforcement learning agent

Proceedings of the 34th conference on Winter simulation: exploring new frontiers
Privacy preserving learning in negotiation

Proceedings of the 2005 ACM symposium on Applied computing
Dynamic preferences in multi-criteria reinforcement learning

ICML '05 Proceedings of the 22nd international conference on Machine learning
A Unified Analysis of Value-Function-Based Reinforcement Learning Algorithms

Neural Computation
A reinforcement learning algorithm to minimize the mean tardiness of a single machine with controlled capacity

Proceedings of the 38th conference on Winter simulation
Hierarchical Average Reward Reinforcement Learning

The Journal of Machine Learning Research
Tuning Local Search by Average-Reward Reinforcement Learning

Learning and Intelligent Optimization
Reinforcement Learning: A Tutorial Survey and Recent Advances

INFORMS Journal on Computing
Dynamic Power Management for Sensor Node in WSN Using Average Reward MDP

WASA '09 Proceedings of the 4th International Conference on Wireless Algorithms, Systems, and Applications
Thresholded rewards: acting optimally in timed, zero-sum games

AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 2
Existence of multiagent equilibria with limited agents

Journal of Artificial Intelligence Research
Reinforcement learning: a survey

Journal of Artificial Intelligence Research
Average-reward decentralized Markov decision processes

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
A neurocomputational model for cocaine addiction

Neural Computation
Using continuous action spaces to solve discrete problems

IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
Research on improvement of model-free average reward reinforcement learning and its simulation experiment

CCDC'09 Proceedings of the 21st annual international conference on Chinese control and decision conference
Reinforcement learning as a means of dynamic aggregate QoS provisioning

Art-QoS'03 Proceedings of the 2003 international conference on Architectures for quality of service in the internet
Smartlocks: lock acquisition scheduling for self-aware synchronization

Proceedings of the 7th international conference on Autonomic computing
A minimum relative entropy principle for learning and acting

Journal of Artificial Intelligence Research
An average-reward reinforcement learning algorithm for computing bias-optimal policies

AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1
Auto-exploratory average reward reinforcement learning

AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1
Adaptation-based programming in Java

Proceedings of the 20th ACM SIGPLAN workshop on Partial evaluation and program manipulation
Internal-time temporal difference model for neural value-based decision making

Neural Computation
Job control in heterogeneous computing systems

Journal of Computer and Systems Sciences International
Smart data structures: an online machine learning approach to multicore data structures

Proceedings of the 8th ACM international conference on Autonomic computing
Towards proactive event-driven computing

Proceedings of the 5th ACM international conference on Distributed event-based system
Minimizing mean weighted tardiness in unrelated parallel machine scheduling with reinforcement learning

Computers and Operations Research
Opponent learning for multi-agent system simulation

RSKT'06 Proceedings of the First international conference on Rough Sets and Knowledge Technology
Brief paper: Average cost temporal-difference learning

Automatica (Journal of IFAC)
Robustness of optimal channel reservation using handover prediction in multiservice wireless networks

Wireless Networks
R(λ) imitation learning for automatic generation control of interconnected power grids

Automatica (Journal of IFAC)
Learning-Based test programming for programmers

ISoLA'12 Proceedings of the 5th international conference on Leveraging Applications of Formal Methods, Verification and Validation: technologies for mastering change - Volume Part I
Learning classifier system with average reward reinforcement learning

Knowledge-Based Systems
Cooperating with a Markovian ad hoc teammate

Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
Multiagent learning in the presence of memory-bounded agents

Autonomous Agents and Multi-Agent Systems
