Imitation learning of sequential, goal-directed behavior by standard supervised techniques is often difficult. We frame learning such behaviors as a maximum margin structured prediction problem over a space of policies. In this approach, we learn mappings from features to costs so that an optimal policy in an MDP with these costs mimics the expert's behavior. Further, we demonstrate a simple, provably efficient approach to structured maximum margin learning, based on the subgradient method, that leverages existing fast algorithms for inference. Although the technique is general, it is particularly relevant to problems where A* and dynamic programming make learning policies tractable beyond the limits of a QP formulation. We demonstrate our approach on route planning for outdoor mobile robots, where the behavior a designer wishes a planner to execute is often clear, while specifying cost functions that engender this behavior is much more difficult.
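As a rough illustration of the idea, the sketch below learns a linear feature-to-cost mapping by subgradient descent on a toy grid, using a dynamic-programming planner as the inference step. Everything concrete here is an assumption made for the example and is not taken from the paper: the 4x4 grid, the restriction to right/down moves, the per-cell loss of 1 for leaving the expert's path, the step-size schedule, and all function names (`best_path`, `train`, `demo`). It is a minimal sketch of the technique, not the authors' implementation.

```python
import math


def best_path(cost):
    """Min-cost path from top-left to bottom-right using only right/down
    moves, found by dynamic programming (negative cell costs are fine)."""
    rows, cols = len(cost), len(cost[0])
    total = [[math.inf] * cols for _ in range(rows)]
    total[0][0] = cost[0][0]
    for r in range(rows):
        for c in range(cols):
            if r > 0:
                total[r][c] = min(total[r][c], total[r - 1][c] + cost[r][c])
            if c > 0:
                total[r][c] = min(total[r][c], total[r][c - 1] + cost[r][c])
    # Backtrack from the goal to recover the path.
    r, c = rows - 1, cols - 1
    path = [(r, c)]
    while (r, c) != (0, 0):
        if r > 0 and (c == 0 or total[r - 1][c] <= total[r][c - 1]):
            r -= 1
        else:
            c -= 1
        path.append((r, c))
    return path[::-1]


def feature_counts(features, path):
    """Sum the feature vectors of all cells visited by a path."""
    counts = [0.0] * len(features[0][0])
    for r, c in path:
        for k, value in enumerate(features[r][c]):
            counts[k] += value
    return counts


def cell_costs(features, w):
    """Linear cost map: cost of a cell is the dot product w . features."""
    return [[sum(wk * fk for wk, fk in zip(w, cell)) for cell in row]
            for row in features]


def train(features, expert_path, steps=200, lr=0.1, reg=0.01):
    """Subgradient descent on a regularized structured hinge objective
    (assumed form): reg/2*|w|^2 + cost(expert) - min_path(cost - loss)."""
    w = [0.0] * len(features[0][0])
    on_expert = set(expert_path)
    expert_counts = feature_counts(features, expert_path)
    for t in range(1, steps + 1):
        cost = cell_costs(features, w)
        # Loss-augmented inference: make off-expert cells look cheaper by 1.
        augmented = [[cost[r][c] - (0.0 if (r, c) in on_expert else 1.0)
                      for c in range(len(cost[0]))]
                     for r in range(len(cost))]
        rival_counts = feature_counts(features, best_path(augmented))
        step = lr / math.sqrt(t)  # assumed diminishing step size
        for k in range(len(w)):
            grad_k = reg * w[k] + expert_counts[k] - rival_counts[k]
            w[k] -= step * grad_k
    return w


def demo():
    size = 4
    # Hypothetical features per cell: [constant bias, rough-terrain flag].
    # The rough region is the top-right 3x3 block of the grid.
    features = [[[1.0, 1.0 if (r < size - 1 and c > 0) else 0.0]
                 for c in range(size)] for r in range(size)]
    # The expert walks down the left edge, then along the bottom edge.
    expert = [(r, 0) for r in range(size)] + \
             [(size - 1, c) for c in range(1, size)]
    w = train(features, expert)
    return w, best_path(cell_costs(features, w)), expert


if __name__ == "__main__":
    weights, learned, expert = demo()
    print("weights:", weights)
    print("planner reproduces expert path:", learned == expert)
```

In this toy setting the planner is called once per iteration on the loss-augmented cost map, so any fast planner could stand in for the dynamic program. The expert avoids the rough block, so the learned rough-terrain weight ends up positive and the planner then follows the demonstrated route.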