We present new global and local policy search algorithms suitable for problems with policy-dependent cost variance (or risk), a property present in many robot control tasks. These algorithms exploit new techniques in non-parametric heteroscedastic regression to directly model the policy-dependent distribution of cost. For local search, the learned cost model can be used as a critic for performing risk-sensitive gradient descent. Alternatively, decision-theoretic criteria can be applied to globally select policies to balance exploration and exploitation in a principled way, or to perform greedy minimization with respect to various risk-sensitive criteria. This separation of learning and policy selection permits variable risk control, where risk-sensitivity can be flexibly adjusted and appropriate policies can be selected at runtime without relearning. We describe experiments in dynamic stabilization and manipulation with a mobile manipulator that demonstrate learning of flexible, risk-sensitive policies in very few trials.
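The runtime selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a cost model has already been learned and has produced a predicted mean and variance of cost for each candidate policy. The policy names, the numbers, and the mean-plus-scaled-standard-deviation criterion with a hypothetical risk factor `kappa` are all assumptions chosen for illustration, being one common risk-sensitive criterion among several possible ones.

```python
import math

def risk_sensitive_cost(mean, variance, kappa):
    """Predicted mean cost plus kappa times the predicted standard deviation.

    kappa > 0 penalizes cost variance (risk-averse), kappa = 0 is
    risk-neutral, and kappa < 0 rewards variance (risk-seeking).
    """
    return mean + kappa * math.sqrt(variance)

def select_policy(predictions, kappa):
    """Greedily pick the policy that minimizes the risk-sensitive cost."""
    return min(
        predictions,
        key=lambda name: risk_sensitive_cost(*predictions[name], kappa),
    )

# Hypothetical output of a learned heteroscedastic cost model:
# policy name -> (predicted mean cost, predicted cost variance).
predictions = {
    "aggressive": (1.0, 4.0),
    "moderate": (2.0, 1.0),
    "cautious": (3.5, 0.01),
}

# The same learned model is reused; only the risk factor changes.
risk_neutral = select_policy(predictions, kappa=0.0)
mildly_averse = select_policy(predictions, kappa=1.2)
strongly_averse = select_policy(predictions, kappa=2.0)
```

Because the risk factor enters only at selection time, changing `kappa` picks a different policy from the same predictions, which is the sense in which risk sensitivity can be adjusted without relearning.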