A novel method is presented for approximating the value function and selecting good actions in Markov decision processes with large state and action spaces. The method approximates state-action values as negative free energies in an undirected graphical model called a product of experts. The model parameters can be learned efficiently because values and their derivatives can be computed efficiently for a product of experts. Good actions can be found even in large factored action spaces by Markov chain Monte Carlo sampling. Simulation results show that the product-of-experts approximation can be used to solve large problems; in one simulation it is used to find actions in an action space of size 2^40.
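The free-energy idea can be sketched in a few lines of numpy. This is a minimal, hypothetical illustration, not the paper's implementation: the network sizes, random parameters, and fixed Gibbs schedule are all assumptions, and a complete method would also learn the weights (e.g. with temporal-difference updates). The sketch shows the two computations the abstract relies on: Q(s, a) taken as the negative free energy of a product of experts (here a restricted Boltzmann machine over state and action bits), and action selection by Gibbs sampling over the action bits with the state clamped.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 6 binary state bits, 8 binary action bits,
# 12 hidden units ("experts"). All values here are assumptions.
N_S, N_A, N_H = 6, 8, 12
W = rng.normal(scale=0.1, size=(N_H, N_S + N_A))  # expert weights
b = np.zeros(N_H)                                 # hidden biases
c = np.zeros(N_S + N_A)                           # visible biases

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def free_energy(s, a):
    """Free energy of the visible configuration x = [s; a].

    For binary hidden units this has the closed form
    F(x) = -c.x - sum_j log(1 + exp(W_j.x + b_j)),
    so it (and its parameter derivatives) can be computed exactly.
    """
    x = np.concatenate([s, a])
    pre = W @ x + b
    # logaddexp(0, z) = log(1 + exp(z)), computed stably
    return -(c @ x) - np.sum(np.logaddexp(0.0, pre))

def q_value(s, a):
    """State-action value approximated as the negative free energy."""
    return -free_energy(s, a)

def sample_action(s, n_steps=20):
    """Gibbs-sample the action bits with the state bits clamped.

    Low-free-energy (high-Q) actions are sampled preferentially,
    so the chain tends toward good actions without enumerating
    the exponentially large action space.
    """
    a = rng.integers(0, 2, size=N_A).astype(float)
    for _ in range(n_steps):
        x = np.concatenate([s, a])
        h = (rng.random(N_H) < sigmoid(W @ x + b)).astype(float)
        p_a = sigmoid(W[:, N_S:].T @ h + c[N_S:])
        a = (rng.random(N_A) < p_a).astype(float)
    return a
```

Because the free energy is a sum over experts, each value evaluation costs O(N_H * (N_S + N_A)) rather than anything exponential in the number of action bits, which is what makes factored action spaces of size 2^40 tractable.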