Use of off-line dynamic programming for efficient image interpretation

  • Authors:
  • Ramana Isukapalli; Russell Greiner

  • Affiliations:
  • Lucent Technologies, Holmdel, NJ; Department of Computing Science, University of Alberta, Edmonton, AB, Canada

  • Venue:
  • IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
  • Year:
  • 2003

Abstract

An interpretation system finds the likely mappings from portions of an image to real-world objects. An interpretation policy specifies which imaging operator to apply, and to which portion of the image, at every stage of interpretation. Earlier results compared a number of policies and demonstrated that policies selecting the operator that maximizes information gain per unit cost worked most effectively. However, those policies are myopic: they rank the operators based only on their immediate rewards. This can lead to inferior overall results, because it may be better to use a relatively expensive operator first if that operator provides information that significantly reduces the cost of the subsequent operators. This suggests using some lookahead process to compute the quality of operators non-myopically. Unfortunately, such lookahead is prohibitively expensive for most domains, especially those with a large number of complex states. We therefore use ideas from reinforcement learning to compute the utility of each operator sequence. In particular, our system first uses dynamic programming, over abstract simplifications of interpretation states, to precompute the utility of each relevant sequence. It does this off-line, over a training sample of images. At run time, our interpretation system uses these estimates to decide when to use which imaging operator. Our empirical results, in the challenging real-world domain of face recognition, demonstrate that this approach works more effectively than myopic approaches.
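
The contrast between a myopic policy and one that uses precomputed utilities can be illustrated with a minimal sketch. This is not the authors' system: the two operator names, the gain and cost numbers, and the deterministic cost model are all hypothetical, and the abstract state is simplified to the set of operators already applied. The paper estimates utilities off-line from training images, whereas this toy computes them exactly by memoized dynamic programming.

```python
from functools import lru_cache

# Hypothetical operators and information gains (toy numbers, not from the paper).
GAIN = {"locate_region": 1.0, "match_feature": 2.0}
OPERATORS = frozenset(GAIN)


def cost(op, applied):
    """Toy cost model: matching becomes cheap once the region is located."""
    if op == "locate_region":
        return 5.0
    return 1.0 if "locate_region" in applied else 4.0


def myopic_choice(applied):
    """Pick the remaining operator with the best immediate gain per cost."""
    remaining = sorted(OPERATORS - applied)
    return max(remaining, key=lambda op: GAIN[op] / cost(op, applied))


@lru_cache(maxsize=None)
def cost_to_go(applied):
    """Minimum total cost to finish from an abstract state.

    The abstract state is the frozenset of operators applied so far;
    this table plays the role of the precomputed utilities.
    """
    remaining = OPERATORS - applied
    if not remaining:
        return 0.0
    return min(cost(op, applied) + cost_to_go(applied | {op}) for op in remaining)


def lookahead_choice(applied):
    """Pick the operator minimizing immediate cost plus precomputed cost-to-go."""
    remaining = sorted(OPERATORS - applied)
    return min(remaining, key=lambda op: cost(op, applied) + cost_to_go(applied | {op}))


def run(policy):
    """Apply a policy until every operator has been used; return order and total cost."""
    applied, total, order = frozenset(), 0.0, []
    while applied != OPERATORS:
        op = policy(applied)
        total += cost(op, applied)
        order.append(op)
        applied = applied | {op}
    return order, total


if __name__ == "__main__":
    print("myopic:   ", run(myopic_choice))
    print("lookahead:", run(lookahead_choice))
```

In this toy, the myopic policy starts with `match_feature` because its immediate gain per cost is higher, and pays a total cost of 9.0. The lookahead policy starts with the more expensive `locate_region`, which makes the later match cheap, and pays 6.0.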