Games are a challenging domain for reinforcement learning (RL) because many of them involve multiple players and many unobservable variables in a large state space. Such realistic multiagent problems with partial observability are hard to solve mainly because estimation and prediction over the whole state space, including the unobservable variables, are computationally prohibitive. To overcome this intractability and let an agent learn in an unknown environment, an effective approximation method is needed together with explicit learning of the environmental model. We present a model-based RL scheme for large-scale multiagent problems with partial observability and apply it to the card game Hearts. This game is a well-defined example of an imperfect-information game and can be approximately formulated as a partially observable Markov decision process (POMDP) for a single learning agent. To reduce the computational cost, we use a sampling technique that approximates the heavy integration required for estimation and prediction with a manageable number of samples. Computer simulation results show that our method is effective in solving such a difficult, partially observable multiagent problem.
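The idea of replacing an integration over unobservable variables with an average over samples can be illustrated with a minimal sketch. It is not the scheme from the paper: it assumes a uniform belief over how the unseen cards are dealt to the opponents, whereas the abstract describes a learned environmental model, and the names `sample_hidden_hands`, `monte_carlo_expectation`, and `opponent0_holds_queen` are hypothetical.

```python
import random


def sample_hidden_hands(unseen_cards, hand_sizes, rng):
    """Draw one assignment of the unseen cards to the opponents.

    Assumption: every deal consistent with the hand sizes is equally likely.
    """
    cards = list(unseen_cards)
    rng.shuffle(cards)
    hands, start = [], 0
    for size in hand_sizes:
        hands.append(cards[start:start + size])
        start += size
    return hands


def monte_carlo_expectation(unseen_cards, hand_sizes, value_fn, n_samples, seed=0):
    """Approximate the expectation of value_fn over hidden deals by sampling,
    instead of summing over every possible deal."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += value_fn(sample_hidden_hands(unseen_cards, hand_sizes, rng))
    return total / n_samples


def opponent0_holds_queen(hands):
    """Toy quantity of interest: 1.0 if the first opponent holds the queen of spades."""
    return 1.0 if "QS" in hands[0] else 0.0


# Six unseen cards split evenly among three opponents (illustrative values).
unseen = ["QS", "AS", "KH", "2C", "7D", "9H"]
estimate = monte_carlo_expectation(unseen, [2, 2, 2], opponent0_holds_queen, 20000)
print(estimate)
```

In this toy setting the exact answer is 2/6, since the first opponent holds two of the six unseen cards, so the estimate can be checked directly. In a full game the number of possible deals grows combinatorially, which is why a sample average becomes attractive.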