A key component of any reinforcement learning (RL) algorithm is the underlying state representation used by the agent. RL agents have typically relied on hand-coded representations, but there has been growing interest in learning the representation automatically. Although an agent's inputs are usually fixed (e.g., state variables correspond to sensors on a robot), it is desirable to automatically determine the optimal relative scaling of those inputs and to diminish the impact of irrelevant features. This work introduces Holler, a novel distance metric learning algorithm, and combines it with an existing instance-based RL algorithm to achieve precisely these goals. The combined algorithm's success is demonstrated empirically on a set of six tasks within the mountain car domain.
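The abstract does not specify Holler's internals, but the underlying idea of a learned distance metric for instance-based RL can be illustrated with a minimal sketch: a diagonal weighted Euclidean metric over state variables, used to select nearest stored instances for value estimation. Everything below (the state layout, the weight values, the k-nearest-neighbor averaging) is a hypothetical illustration, not the paper's actual method.

```python
import numpy as np

def weighted_distance(x, y, w):
    """Diagonal (per-feature) weighted Euclidean distance.

    The weight vector w rescales each state variable; a weight near
    zero effectively removes an irrelevant feature from the metric.
    """
    return np.sqrt(np.sum(w * (x - y) ** 2))

def knn_value(query, states, values, w, k=3):
    """Instance-based value estimate: average the values of the
    k nearest stored instances under the weighted metric."""
    dists = np.array([weighted_distance(query, s, w) for s in states])
    nearest = np.argsort(dists)[:k]
    return values[nearest].mean()

# Toy mountain-car-like instances: (position, velocity). Velocity lives
# on a much smaller numeric scale, so an unscaled Euclidean metric
# would nearly ignore it -- exactly the problem metric learning fixes.
states = np.array([[-1.0, 0.01], [-0.5, -0.02], [0.3, 0.05], [0.5, -0.04]])
values = np.array([-80.0, -60.0, -20.0, -30.0])

# Hypothetical learned weights: strongly upweight velocity so both
# state variables contribute comparably to the distance.
w = np.array([1.0, 2500.0])
print(knn_value(np.array([0.4, 0.04]), states, values, w, k=2))  # -50.0
```

With uniform weights the velocity coordinate would be swamped by position differences; the learned diagonal weights restore its influence, which is the kind of relative-scaling effect the abstract describes.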