Use of off-line dynamic programming for efficient image interpretation

Authors:
Ramana Isukapalli;Russell Greiner
Affiliations:
Lucent Technologies, Holmdel, NJ;Department of Computing Science, University of Alberta, Edmonton, AB, Canada
Venue:
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Year:
2003

Citing 3
Cited 2

Machine Learning

Machine Learning
Introduction to Reinforcement Learning

Introduction to Reinforcement Learning
Eigenfaces for recognition

Journal of Cognitive Neuroscience

Learning to detect objects of many classes using binary classifiers

ECCV'06 Proceedings of the 9th European conference on Computer Vision - Volume Part I
Learning a dynamic classification method to detect faces and identify facial expression

AMFG'05 Proceedings of the Second international conference on Analysis and Modelling of Faces and Gestures

Quantified Score

Hi-index	0.00

Visualization

Abstract

An interpretation system finds the likely mappings from portions of an image to real-world objects. An interpretation policy specifies when to apply which imaging operator, to which portion of the image, during every stage of interpretation. Earlier results compared a number of policies, and demonstrated that policies that select operators which maximize the information gain per cost, worked most effectively. However, those policies are myopic -- they rank the operators based only on their immediate rewards. This can lead to inferior overall results: it may be better to use a relatively expensive operator first, if that operator provides information that will significantly reduce the cost of the subsequent operators. This suggests using some lookahead process to compute the quality for operators non-myopically. Unfortunately, this is prohibitively expensive for most domains, especially for domains that have a large number of complex states. We therefore use ideas from reinforcement learning to compute the utility of each operator sequence. In particular, our system first uses dynamic programming, over abstract simplifications of interpretation states, to precompute the utility of each relevant sequence. It does this off-line, over a training sample of images. At run time, our interpretation system uses these estimates to decide when to use which imaging operator. Our empirical results, in the challenging real-world domain of face recognition, demonstrate that this approach works more effectively than myopic approaches.