Biologically inspired Cartesian and non-Cartesian filters for attentional sequences

  • Authors:
  • H. I. Bozma; G. Çakiroğlu; Ç. Soyer

  • Affiliations:
  • Intelligent Systems Laboratory, Department of Electrical and Electronics Engineering, and Institute of Biomedical Engineering, Boğaziçi University, Bebek 80815, Istanbul, Turkey

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2003


Abstract

The aim of this paper is to develop a rich set of visual primitives that can be used by a camera-endowed robot as it explores a scene, thereby generating an attentional sequence -- a spatio-temporally related set of visual features. Our starting point is inspired by the work of Gallant et al. on the responses of area V4 in macaque monkeys to Cartesian and non-Cartesian stimuli. The novelty of these stimuli is that, in addition to conventional sinusoidal gratings, they include non-Cartesian stimuli such as circular, polar and hyperbolic gratings. Based on this stimulus set, and introducing frequency as a parameter, we obtain a rich set of visual primitives. These visual primitives are biologically motivated, nearly orthogonal with some degree of redundancy, can be made complete as required, and yet are implementable on off-the-shelf hardware for real-time selective-vision robot applications. Attentional sequences are then formed as spatio-temporal sequences of observations, each of which encodes the filter responses of a fovea as an observation vector consisting of the responses of 50 filters. A series of experiments demonstrates the use of these visual primitives in attention-based real-life scene recognition tasks: (1) modeling complex scenes based on average attentional sequence responses, and (2) fast real-time recognition of relatively complex scenes with a few saccades, based on the comparison of the current attentional sequence to a priori learned average observation vectors.
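The abstract's filter construction can be illustrated with a small sketch. This is not the authors' implementation: the grating formulas, the bank composition (four orientations plus concentric, radial and hyperbolic gratings per frequency), the normalization, and the nearest-signature `recognize` step are illustrative assumptions; the paper's actual bank has 50 filters with its own parameterization.

```python
import numpy as np

def _grid(size):
    # Normalized coordinates over a square fovea patch.
    y, x = np.mgrid[-1:1:complex(0, size), -1:1:complex(0, size)]
    return x, y

def cartesian_grating(size, freq, theta):
    # Conventional sinusoidal grating at orientation theta (radians).
    x, y = _grid(size)
    return np.cos(2 * np.pi * freq * (x * np.cos(theta) + y * np.sin(theta)))

def polar_grating(size, radial_freq, angular_freq):
    # Non-Cartesian polar grating: concentric rings (angular_freq = 0),
    # radial spokes (radial_freq = 0), or spiral mixtures of the two.
    x, y = _grid(size)
    r, phi = np.hypot(x, y), np.arctan2(y, x)
    return np.cos(2 * np.pi * radial_freq * r + angular_freq * phi)

def hyperbolic_grating(size, freq):
    # Non-Cartesian grating, constant along the hyperbolas x * y = c.
    x, y = _grid(size)
    return np.cos(2 * np.pi * freq * x * y)

def filter_bank(size, freqs, n_orient=4):
    # A bank mixing Cartesian and non-Cartesian gratings; frequency is the
    # extra parameter that enlarges the set (here 7 filters per frequency).
    bank = []
    for f in freqs:
        for theta in np.linspace(0, np.pi, n_orient, endpoint=False):
            bank.append(cartesian_grating(size, f, theta))
        bank.append(polar_grating(size, f, 0))           # concentric
        bank.append(polar_grating(size, 0, 2 * int(f)))  # radial
        bank.append(hyperbolic_grating(size, f))
    return np.stack(bank)

def observation_vector(patch, bank):
    # Encode one fovea as the vector of (normalized) filter responses.
    flat = bank.reshape(len(bank), -1)
    return flat @ patch.ravel() / flat.shape[1]

def recognize(sequence, signatures):
    # Compare the current attentional sequence (n_fixations x n_filters)
    # to a priori learned average observation vectors, one per scene.
    probe = sequence.mean(axis=0)
    return min(signatures, key=lambda name: np.linalg.norm(probe - signatures[name]))
```

Because full-period sinusoids at distinct frequencies and orientations have near-zero inner products over the patch, the bank is close to orthogonal while the polar and hyperbolic members add the redundancy the abstract mentions.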