Reward shaping is a well-known technique for helping reinforcement-learning agents converge more quickly to near-optimal behavior. In this paper, we introduce social reward shaping: reward shaping applied in the multiagent-learning framework. We present preliminary experiments in the iterated Prisoner's Dilemma showing that agents using social reward shaping appropriately can behave more effectively than classical learning and non-learning strategies. In particular, these agents can both lead (encourage adaptive opponents to cooperate stably) and follow (adopt a best-response strategy when paired with a fixed opponent), whereas better-known approaches achieve only one of these objectives.
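The abstract gives no implementation details. As a hedged illustration of the general idea only, the sketch below grafts potential-based reward shaping, F(s, s') = γΦ(s') − Φ(s), onto a tabular Q-learner playing the iterated Prisoner's Dilemma against Tit-for-Tat. The payoff matrix, the potential function, the opponent, and all hyperparameters are assumptions chosen for illustration; they are not the paper's actual social shaping scheme.

```python
import random

# Iterated Prisoner's Dilemma payoffs for the row player: (my_move, their_move) -> reward.
# C = cooperate, D = defect. Standard illustrative payoffs, not taken from the paper.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
ACTIONS = ["C", "D"]
GAMMA = 0.9

def potential(state):
    """Hypothetical shaping potential favoring mutual cooperation.

    The state is the pair of moves played on the previous round. Valuing
    (C, C) is one plausible 'social' potential; the paper's choice may differ.
    """
    return 3.0 if state == ("C", "C") else 0.0

def shaped_reward(reward, state, next_state):
    # Potential-based shaping: F = gamma * Phi(s') - Phi(s), which provably
    # preserves the optimal policy (Ng, Harada, and Russell, 1999).
    return reward + GAMMA * potential(next_state) - potential(state)

def q_learning_vs_tit_for_tat(steps=20000, alpha=0.1, epsilon=0.1, seed=0):
    """Train a shaped Q-learner against Tit-for-Tat; return its greedy policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in PAYOFF for a in ACTIONS}
    state = ("C", "C")  # assume both players opened with cooperation
    for _ in range(steps):
        action = (rng.choice(ACTIONS) if rng.random() < epsilon
                  else max(ACTIONS, key=lambda a: q[(state, a)]))
        opponent = state[0]  # Tit-for-Tat repeats our previous move
        reward = PAYOFF[(action, opponent)]
        next_state = (action, opponent)
        target = shaped_reward(reward, state, next_state) + GAMMA * max(
            q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (target - q[(state, action)])
        state = next_state
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in PAYOFF}
```

Because the shaping term is potential-based, it changes the learner's transient behavior (nudging it toward mutual cooperation faster) without altering which policy is optimal; against Tit-for-Tat with γ = 0.9, sustained cooperation remains the best response either way.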