A fast data collection and augmentation procedure for object recognition

  • Authors:
  • Benjamin Sapp; Ashutosh Saxena; Andrew Y. Ng

  • Affiliations:
  • Computer Science Department, Stanford University, Stanford, CA (all authors)

  • Venue:
  • AAAI'08 Proceedings of the 23rd National Conference on Artificial Intelligence - Volume 3
  • Year:
  • 2008


Abstract

When building an application that requires object class recognition, having enough data to learn from is critical for good performance and can easily determine the success or failure of the system. However, collecting data is typically extremely labor-intensive, as the process usually involves acquiring an image followed by manual cropping and hand-labeling. Preparing large training sets for object recognition has already become one of the main bottlenecks for emerging applications such as mobile robotics and object recognition on the web. This paper focuses on a novel and practical solution to the dataset collection problem. Our method uses a green screen to rapidly collect example images; we then use a probabilistic model to rapidly synthesize a much larger training set that attempts to capture desired invariants in the object's foreground and background. We demonstrate this procedure on our own mobile robotics platform, where we achieve a 135x savings in the time and effort needed to obtain a training set. Our data collection method is agnostic to the learning algorithm being used and applies to any of a large class of standard object recognition methods. Given these results, we suggest that this method become a standard protocol for developing scalable object recognition systems. Further, we used our data to build reliable classifiers that enabled our robot to visually recognize an object in an office environment, and thereby fetch an object from an office in response to a verbal request.
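The core idea of the procedure, green-screen capture followed by compositing onto many backgrounds to synthesize training examples, can be illustrated with a minimal sketch. This is not the authors' implementation: their synthesis uses a probabilistic model of foreground and background invariants, whereas the version below uses a simple chroma-key threshold (the function names and the `green_thresh` parameter are hypothetical) and naive pasting, purely to show the data-multiplication step.

```python
import numpy as np

def chroma_key_mask(image, green_thresh=1.3):
    # Foreground mask: keep pixels where green does NOT dominate red and blue.
    # green_thresh is an illustrative tuning parameter, not from the paper.
    r, g, b = image[..., 0], image[..., 1], image[..., 2]
    is_screen = (g > green_thresh * r) & (g > green_thresh * b)
    return ~is_screen

def composite(foreground, mask, background):
    # Paste the masked foreground pixels onto a new background image.
    out = background.copy()
    out[mask] = foreground[mask]
    return out

def synthesize_examples(foreground, mask, backgrounds):
    # One green-screen capture yields one synthetic example per background,
    # which is where the large multiplicative savings in labeling effort comes from.
    return [composite(foreground, mask, bg) for bg in backgrounds]
```

A single captured image paired with, say, 100 background images yields 100 labeled training examples, with the crop and label implied by the mask rather than annotated by hand.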