We introduce a new framework for image classification that extends beyond the window sampling of fixed spatial pyramids to include a comprehensive set of windows densely sampled over location, size and aspect ratio. To deal effectively with this large set of windows, we derive a concise high-level image feature using a two-level extraction method. At the first level, window-based features are computed from local descriptors (e.g., SIFT, spatial HOG, LBP) in a process similar to standard feature extractors. At the second level, the new image feature is computed from the window-based features in a manner analogous to the first level. This higher level of abstraction offers both efficient handling of dense samples and reduced sensitivity to misalignment. More importantly, our simple yet effective framework can readily accommodate a large number of existing pooling/coding methods, allowing them to extract features beyond the spatial pyramid representation. To fuse the second-level feature with a standard first-level image feature for classification, we additionally propose a new learning algorithm, called Generalized Adaptive ℓp-norm Multiple Kernel Learning (GA-MKL), which learns an adapted robust classifier based on multiple base kernels constructed from image features and multiple sets of pre-learned classifiers of all the classes. Extensive evaluation on the object recognition (Caltech256) and scene recognition (15Scenes) benchmark datasets demonstrates that the proposed method outperforms state-of-the-art image classification algorithms under a broad range of settings.
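The two-level extraction described above can be illustrated with a minimal sketch. It is not the authors' implementation: hard-assignment vector quantization and histogram (average) pooling are stand-ins chosen for brevity, since the abstract only says the framework accommodates many existing coding/pooling methods. The function names, the random codebooks, and the choice to enumerate windows on a grid of descriptor cells are all assumptions made for illustration.

```python
import numpy as np


def enumerate_windows(grid_h, grid_w, heights, aspect_ratios, stride=1):
    """List (y0, x0, h, w) windows sampled over location, size and aspect ratio."""
    windows = []
    for h in heights:
        for ar in aspect_ratios:
            w = max(1, int(round(h * ar)))  # width implied by the aspect ratio
            if h > grid_h or w > grid_w:
                continue  # window does not fit on the descriptor grid
            for y0 in range(0, grid_h - h + 1, stride):
                for x0 in range(0, grid_w - w + 1, stride):
                    windows.append((y0, x0, h, w))
    return windows


def hard_assign(vectors, codebook):
    """Index of the nearest codeword (squared Euclidean distance) for each row."""
    d2 = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def histogram(indices, size):
    """L1-normalised histogram of codeword indices."""
    counts = np.bincount(indices, minlength=size).astype(float)
    return counts / max(counts.sum(), 1.0)


def two_level_feature(desc_grid, codebook1, codebook2, windows):
    """Assumed two-level pipeline: code + pool per window, then code + pool again."""
    grid_h, grid_w, dim = desc_grid.shape
    # Level 1: code every local descriptor, then pool the codes inside each window.
    codes = hard_assign(desc_grid.reshape(-1, dim), codebook1).reshape(grid_h, grid_w)
    win_feats = np.stack([
        histogram(codes[y0:y0 + h, x0:x0 + w].ravel(), len(codebook1))
        for (y0, x0, h, w) in windows
    ])
    # Level 2: treat the window features as descriptors and repeat coding + pooling.
    win_codes = hard_assign(win_feats, codebook2)
    return histogram(win_codes, len(codebook2))


# Toy usage with random data standing in for real descriptors and learned codebooks.
rng = np.random.default_rng(0)
desc_grid = rng.normal(size=(8, 8, 16))   # 8x8 grid of 16-D local descriptors
codebook1 = rng.normal(size=(32, 16))     # hypothetical first-level codebook
codebook2 = rng.random(size=(10, 32))     # hypothetical second-level codebook
windows = enumerate_windows(8, 8, heights=(2, 4, 8), aspect_ratios=(0.5, 1.0, 2.0))
feature = two_level_feature(desc_grid, codebook1, codebook2, windows)
```

In this sketch the final vector has a fixed length (the size of the second codebook) regardless of how many windows are sampled, which is one way to read the abstract's claim that the second level yields a concise feature from a dense window set. In practice both codebooks would be learned from training data, for example with k-means, rather than drawn at random.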