Learning methods for generic object recognition with invariance to pose and lighting

Authors:
Yann LeCun; Fu Jie Huang; Léon Bottou
Affiliations:
The Courant Institute, New York University, New York, NY; The Courant Institute, New York University, New York, NY; NEC Labs America, Princeton, NJ
Venue:
CVPR'04: Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Year:
2004

Abstract

We assess the applicability of several popular learning methods to the problem of recognizing generic visual categories with invariance to pose, lighting, and surrounding clutter. A large dataset comprising stereo image pairs of 50 uniform-colored toys under 36 azimuths, 9 elevations, and 6 lighting conditions was collected (for a total of 194,400 individual images). The objects were 10 instances of 5 generic categories: four-legged animals, human figures, airplanes, trucks, and cars. Five instances of each category were used for training, and the other five for testing. Low-resolution grayscale images of the objects with various amounts of variability and surrounding clutter were used for training and testing. Nearest Neighbor methods, Support Vector Machines, and Convolutional Networks, operating on raw pixels or on PCA-derived features, were tested. Test error rates for unseen object instances placed on uniform backgrounds were around 13% for SVM and 7% for Convolutional Nets. On a segmentation/recognition task with highly cluttered images, SVM proved impractical, while Convolutional Nets yielded 16.7% error. A real-time version of the system was implemented that can detect and classify objects in natural scenes at around 10 frames per second.
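The image count quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the constant and function names are hypothetical, and it assumes that each stereo pair contributes two individual images and that every object was captured under every combination of azimuth, elevation, and lighting. The per-split figures count raw captured images only, before any added variability or clutter.

```python
# Dataset dimensions as stated in the abstract (names are illustrative).
NUM_CATEGORIES = 5            # animals, humans, airplanes, trucks, cars
INSTANCES_PER_CATEGORY = 10   # toy instances per category
AZIMUTHS = 36
ELEVATIONS = 9
LIGHTINGS = 6
IMAGES_PER_STEREO_PAIR = 2    # assumption: one left and one right image


def count_images(instances_per_category):
    """Count individual images for a given number of instances per category."""
    objects = NUM_CATEGORIES * instances_per_category
    views_per_object = AZIMUTHS * ELEVATIONS * LIGHTINGS
    return objects * views_per_object * IMAGES_PER_STEREO_PAIR


total_images = count_images(INSTANCES_PER_CATEGORY)
# Five instances per category for training, the other five for testing.
train_images = count_images(5)
test_images = count_images(5)

print(total_images, train_images, test_images)
```

Under these assumptions the total matches the 194,400 images reported, split evenly between training and test instances.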