1994 Special Issue: Modeling visual recognition from neurobiological constraints

  • Authors: Mike W. Oram; David I. Perrett

  • Venue: Neural Networks - Special issue: models of neurodynamics and behavior
  • Year: 1994

Abstract

Neurobiological data from the cerebral cortex of the macaque monkey suggest a model of object recognition that is a series of four computational stages. These are executed in seven major hierarchically arranged areas of processing, each area with an input and an output layer of cells. The first computational stage occurs within early visual cortex and involves the first two cortical areas. Here it appears that boundaries between image regions and logical groupings of local oriented image elements that "belong" together are computed. These processes segregate image attributes that can then be treated as arising from the same object. The next three visual cortical areas execute the second computational stage and display sensitivity to an ever increasing complexity and variety of visual shape features (e.g., T junctions, concentric rings, spotted triangle shape). The third stage of processing seems to utilize combinations of these shape features to establish selectivity to what we refer to as object-feature instances (i.e., the approximate appearance of a small number of object attributes seen under particular viewing conditions). Cells in these areas tolerate change in position but show only limited generalization for change in retinal size, orientation, or perspective view. The fourth computational process occurs within the final cortical areas and gives rise to cell selectivity showing object constancy across size and orientation. This process probably occurs through pooling of the outputs of cells responsive to different instances of the same object view. Importantly, constancy across perspective view (i.e., the transition between viewer-centred and object-centred representation) does not seem to be completed except by a small percentage of cells. 
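The fourth stage's pooling mechanism can be illustrated with a minimal sketch. This is not the authors' implementation; the Gaussian tuning function, the `sigma` width, and the use of a max-pool are all illustrative assumptions. The idea is only that an output cell responding to *any* of several instance-tuned afferents (each selective for the same object at a different size or orientation) inherits constancy across those viewing conditions:

```python
import math

def instance_response(stimulus, preferred, sigma=0.5):
    """Hypothetical instance-tuned cell: Gaussian tuning around its
    preferred (size, orientation) pair. Both arguments are tuples."""
    d2 = sum((s - p) ** 2 for s, p in zip(stimulus, preferred))
    return math.exp(-d2 / (2 * sigma ** 2))

def pooled_response(stimulus, instances):
    """Object-constancy cell: pools (here, max over) the outputs of
    instance-tuned cells for the same object under different views."""
    return max(instance_response(stimulus, p) for p in instances)

# Instance cells tuned to one object at different sizes/orientations.
instances = [(1.0, 0.0), (2.0, 0.0), (1.0, 45.0), (2.0, 45.0)]
```

The pooled cell responds strongly whenever the stimulus matches any of its afferents' preferred instances, so its selectivity for the object survives the size and orientation changes that defeat each individual instance-tuned cell.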
Synaptic changes encompassing various associative (e.g., Hebbian) and non-associative (e.g., decorrelating) procedures may allow cells throughout the stages of processing to become tuned to frequently experienced image attributes, shapes, and objects. Associative learning procedures operating over short time periods may underlie the progressive generalization over changing viewing conditions. Constancy across position, orientation, size, and, finally, perspective view and object parts is established slowly as each area pools the appropriate outputs of the less specific cells in the preceding area. After such learning procedures, the visual system can operate to resolve the appearance of unexpected objects primarily in a feedforward manner, without the need for lateral inhibition or feedback loops, a property few models embody. This feedforward processing does not deny the possibility that top-down influences, although poorly understood, may play a role in nulling image aspects that are predictable in appearance and/or not the object of attention such that only features containing relevant discriminatory information are processed further.
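The associative tuning described above can be sketched with a toy Hebbian update. Again this is an illustrative assumption, not the paper's procedure: weights grow in proportion to the product of pre- and post-synaptic activity, with a normalization step (in the spirit of Oja's rule) standing in for the non-associative bounding the abstract mentions. Repeated presentation of a frequent input pattern tunes the cell to it:

```python
def hebbian_step(weights, pattern, rate=0.1):
    """One associative update: strengthen each weight in proportion to
    presynaptic activity (pattern) times postsynaptic response, then
    normalize so the weight vector stays bounded."""
    response = sum(w * x for w, x in zip(weights, pattern))
    updated = [w + rate * response * x for w, x in zip(weights, pattern)]
    norm = sum(w * w for w in updated) ** 0.5
    return [w / norm for w in updated]

weights = [0.5, 0.5, 0.5, 0.5]
frequent = [1.0, 1.0, 0.0, 0.0]   # a frequently experienced image attribute
for _ in range(50):
    weights = hebbian_step(weights, frequent)
# weights now align with the frequent pattern; unused synapses decay.
```

After repeated exposure the cell's weight vector points along the frequently experienced pattern, illustrating how cells at each stage could become tuned to common image attributes without supervision.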