How does the brain rapidly learn and reorganize view-invariant and position-invariant object representations in the inferotemporal cortex?

Authors:
Yongqiang Cao;Stephen Grossberg;Jeffrey Markowitz
Venue:
Neural Networks
Year:
2011

Abstract

All primates depend for their survival on being able to rapidly learn about and recognize objects. Objects may be visually detected at multiple positions, sizes, and viewpoints. How does the brain rapidly learn and recognize objects while scanning a scene with eye movements, without causing a combinatorial explosion in the number of cells that are needed? How does the brain avoid the problem of erroneously classifying parts of different objects together at the same or different positions in a visual scene? In monkeys and humans, a key area for such invariant object category learning and recognition is the inferotemporal cortex (IT). A neural model is proposed to explain how spatial and object attention coordinate the ability of IT to learn invariant category representations of objects that are seen at multiple positions, sizes, and viewpoints. The model clarifies how interactions within a hierarchy of processing stages in the visual brain accomplish this. These stages include the retina, lateral geniculate nucleus, and cortical areas V1, V2, V4, and IT in the brain's What cortical stream, as they interact with spatial attention processes within the parietal cortex of the Where cortical stream. The model builds upon the ARTSCAN model, which proposed how view-invariant object representations are generated. The positional ARTSCAN (pARTSCAN) model proposes how the following additional processes in the What cortical processing stream also enable position-invariant object representations to be learned: IT cells with persistent activity, and a combination of normalizing object category competition and a view-to-object learning law which together ensure that unambiguous views have a larger effect on object recognition than ambiguous views. The model explains how such invariant learning can be fooled when monkeys, or other primates, are presented with an object that is swapped with another object during eye movements to foveate the original object. 
The swapping procedure is predicted to prevent the reset of spatial attention, which would otherwise keep the representations of multiple objects from being combined by learning. Li and DiCarlo (2008) have presented neurophysiological data from monkeys showing how unsupervised natural experience in a target swapping experiment can rapidly alter object representations in IT. The model quantitatively simulates the swapping data by showing how the swapping procedure fools the spatial attention mechanism. More generally, the model provides a unifying framework, and testable predictions in both monkeys and humans, for understanding object learning data using neurophysiological methods in monkeys, and spatial attention, episodic learning, and memory retrieval data using functional imaging methods in humans.
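The role of the attentional reset in the swapping account can be illustrated with a deliberately crude bookkeeping sketch. It is not the pARTSCAN model, which is defined by neural dynamics that the abstract does not give; it only assumes that views seen between two resets of spatial attention are linked to the same object category. The function name, the event format, and the view labels are all hypothetical.

```python
# Toy illustration only: NOT the pARTSCAN equations.
# Assumption: every view seen while spatial attention has not been reset
# is associated with the currently active object category.

def learn_view_to_object(events):
    """Link views to object categories.

    events: list of (view_label, attention_reset_before_view) pairs.
    Returns a dict mapping each view label to the set of object
    category indices it became associated with.
    """
    view_to_object = {}
    current_object = -1  # no object category is active yet
    for view, reset in events:
        # A reset of spatial attention (or the very first view)
        # recruits a fresh object category.
        if reset or current_object < 0:
            current_object += 1
        view_to_object.setdefault(view, set()).add(current_object)
    return view_to_object


# Normal viewing: attention resets before the second object is inspected.
normal = [("A_left", True), ("A_front", False), ("B_front", True)]

# Swap condition: object B replaces A during the eye movement and,
# as the abstract predicts, the reset does not occur.
swapped = [("A_left", True), ("A_front", False), ("B_front", False)]

normal_map = learn_view_to_object(normal)
swap_map = learn_view_to_object(swapped)

print(normal_map)  # views of A and B end up in different categories
print(swap_map)    # all views are merged into a single category
```

In the normal sequence the views of A and B are assigned to separate categories, whereas in the swap sequence the missing reset causes the view of B to be learned as part of A's category, which is the qualitative effect the abstract describes.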