When Pyramids Learned Walking

  • Authors: Walter G. Kropatsch
  • Affiliation: PRIP, Vienna University of Technology, Austria
  • Venue: CIARP '09, Proceedings of the 14th Iberoamerican Conference on Pattern Recognition: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications
  • Year: 2009


Abstract

A temporal image sequence increases the dimension of the data simply by stacking images above each other, which further raises the computational complexity of the processing. The typical content of a pixel or a voxel is its grey or color value; with some processing, features and fitted model parameters are added. In a pyramid these values are repeatedly summarized in a stack of images or image descriptions with a constant factor of reduction. This is the source of their efficiency: global information can be transmitted in $\log(\mbox{diameter})$ steps. Content propagates bottom-up by reduction functions such as inheritance or filters, and top-down by expansion functions such as interpolation or projection. Moving objects occlude different parts of the image background. Computing one pyramid per frame requires a large amount of bottom-up computation and very complex, time-consuming updating. In the new concept we propose one pyramid per object and one pyramid for the background. The connection between the two is established by coordinates coded in the pyramidal cells, much as in a Laplacian pyramid or a wavelet decomposition. We envision that this code will be stored in each cell and will be invariant to the basic movements of the object. All the information about the position and orientation of the object is concentrated in the apex. New positions are calculated for the apex and can be accurately reconstructed for every cell in a top-down process. At the new pixel locations the expected content can be verified by comparing it with the actual image frame.
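The abstract's core mechanics can be illustrated with a generic sketch (not the author's actual method): a bottom-up reduction with a constant factor of 2 yields a pyramid whose depth grows as $\log(\mbox{diameter})$, with the apex summarizing the whole image, and a top-down expansion projecting apex-level content back toward the base. The function names and the choice of block averaging are illustrative assumptions.

```python
import numpy as np

def reduce_level(img):
    # One bottom-up reduction step: summarize 2x2 blocks by their mean
    # (a constant reduction factor of 2 in each dimension).
    h, w = img.shape[0] - img.shape[0] % 2, img.shape[1] - img.shape[1] % 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def build_pyramid(img):
    # Repeat the reduction until a single cell (the apex) remains.
    levels = [img]
    while min(levels[-1].shape) > 1:
        levels.append(reduce_level(levels[-1]))
    return levels

def expand_level(img, shape):
    # One top-down expansion step: project each cell onto its children
    # (nearest-neighbour; interpolation would be a smoother choice).
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)[:shape[0], :shape[1]]

img = np.arange(64, dtype=float).reshape(8, 8)
pyr = build_pyramid(img)
# An 8x8 image yields log2(8) + 1 = 4 levels: 8x8, 4x4, 2x2, 1x1 (the apex).
print(len(pyr), pyr[-1].shape)
```

In this toy version the apex holds only the global mean; in the proposed concept it would instead concentrate the object's position and orientation, from which every cell's new location is reconstructed top-down.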