Unlike many other object recognition datasets, which provide either category-level or within-category annotations, we introduce a novel dataset, ''IAIR-CarPed'', with layered semantic labels ranging from categories to fine-grained subcategories. These labels were collected from 20 subjects through strict psychophysical experiments. To the best of our knowledge, this is the first object recognition dataset built in this way to capture the adaptive and in-depth interpretations of objects in human vision. The dataset focuses on ''car'' and ''pedestrian'', two representative categories that are important in real applications. It contains 3132 images taken under various conditions and 8567 objects carefully annotated by all 20 subjects. Besides fine-grained, layered semantic labels, five types of detailed visual difficulties are also provided for each object; these can be used to evaluate the representation and generalization abilities of recognition systems against individual difficulties. We describe the construction of the dataset, present its statistics and properties, and discuss possible applications with some preliminary experimental results.