Cost-sensitive reinforcement learning for adaptive classification and control

  • Authors:
  • Ming Tan

  • Affiliations:
  • School of Computer Science, Carnegie Mellon University, Pittsburgh, PA

  • Venue:
  • AAAI'91: Proceedings of the Ninth National Conference on Artificial Intelligence - Volume 2
  • Year:
  • 1991

Abstract

Standard reinforcement learning methods assume that the agent can identify each state distinctly before choosing an action. In reality, a robot agent has only limited sensing capability, and identifying each state through extensive sensing can be time-consuming. This paper describes an approach that learns active perception strategies in reinforcement learning and considers sensing costs explicitly. The approach integrates cost-sensitive learning with reinforcement learning to learn an efficient internal state representation and a decision policy simultaneously in a finite, deterministic environment. It not only maximizes the long-term discounted reward per action but also reduces the average sensing cost per state. The initial experimental results in a simulated robot navigation domain are encouraging.
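
The sketch below is a minimal Python illustration of the general idea of charging sensing costs against the reward in tabular Q-learning, not Tan's exact algorithm: the grid world, the sensor names, and the cost values are all hypothetical, and the internal-state construction is deliberately simplified.

```python
import random
from collections import defaultdict

# Hypothetical finite, deterministic 5x5 grid world.  The agent builds an
# internal state description from sensor readings (each sensor has a cost)
# and then chooses a movement action.  Tan's approach additionally learns
# WHICH sensing steps are worth their cost; here every sensor is applied,
# and the sensing cost is simply charged against the reward.

ACTIONS = ["up", "down", "left", "right"]
SENSORS = {"sonar": 0.5, "camera": 2.0}   # hypothetical sensing costs

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

q = defaultdict(float)                     # Q[(internal_state, action)]

def sense(true_pos, sensors_used):
    """Build an internal state from the sensors applied so far."""
    # For illustration only: each sensor reveals one coordinate.
    readings = []
    if "sonar" in sensors_used:
        readings.append(("x", true_pos[0]))
    if "camera" in sensors_used:
        readings.append(("y", true_pos[1]))
    return tuple(readings)

def step(pos, action):
    """Deterministic transition; goal at (4, 4)."""
    x, y = pos
    if action == "up":    y = min(y + 1, 4)
    if action == "down":  y = max(y - 1, 0)
    if action == "left":  x = max(x - 1, 0)
    if action == "right": x = min(x + 1, 4)
    reward = 10.0 if (x, y) == (4, 4) else -1.0
    return (x, y), reward, (x, y) == (4, 4)

for episode in range(500):
    pos, done = (0, 0), False
    while not done:
        sensors_used = list(SENSORS)
        sensing_cost = sum(SENSORS[s] for s in sensors_used)
        state = sense(pos, sensors_used)

        # Epsilon-greedy action selection over the learned Q-values.
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])

        pos, reward, done = step(pos, action)
        next_state = sense(pos, sensors_used)

        # Cost-sensitive update: the sensing cost reduces the effective reward,
        # so policies that need cheaper state identification are preferred.
        target = reward - sensing_cost + GAMMA * max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += ALPHA * (target - q[(state, action)])
```

In this toy version the trade-off is visible only in the returns; making the choice of sensing operations itself part of the learned policy, as the abstract describes, is what lets the agent reduce the average sensing cost per state rather than merely account for it.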