Machine Learning
Video Google: A Text Retrieval Approach to Object Matching in Videos
ICCV '03 Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2
CVPRW '04 Proceedings of the 2004 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'04) Volume 12 - Volume 12
A Performance Evaluation of Local Descriptors
IEEE Transactions on Pattern Analysis and Machine Intelligence
Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories
CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2
Evaluating bag-of-visual-words representations in scene classification
Proceedings of the international workshop on Workshop on multimedia information retrieval
A comparison of color features for visual concept classification
CIVR '08 Proceedings of the 2008 international conference on Content-based image and video retrieval
The MIR flickr retrieval evaluation
MIR '08 Proceedings of the 1st ACM international conference on Multimedia information retrieval
Kernel Codebooks for Scene Categorization
ECCV '08 Proceedings of the 10th European Conference on Computer Vision: Part III
Automated Flower Classification over a Large Number of Classes
ICVGIP '08 Proceedings of the 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing
New trends and ideas in visual concept detection: the MIR flickr retrieval evaluation initiative
Proceedings of the international conference on Multimedia information retrieval
Online Learning for Matrix Factorization and Sparse Coding
The Journal of Machine Learning Research
IEEE Transactions on Pattern Analysis and Machine Intelligence
Improving the fisher kernel for large-scale image classification
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part IV
Efficient highly over-complete sparse coding using a mixture model
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part V
Image classification using super-vector coding of local image descriptors
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part V
Discriminative affine sparse codes for image classification
CVPR '11 Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition
Matching pursuits with time-frequency dictionaries
IEEE Transactions on Signal Processing
Greed is good: algorithmic results for sparse approximation
IEEE Transactions on Information Theory
In defense of soft-assignment coding
ICCV '11 Proceedings of the 2011 International Conference on Computer Vision
Ask the locals: Multi-way local pooling for image recognition
ICCV '11 Proceedings of the 2011 International Conference on Computer Vision
A graph-matching kernel for object categorization
ICCV '11 Proceedings of the 2011 International Conference on Computer Vision
Hi-index | 0.00 |
Bag-of-Words lies at a heart of modern object category recognition systems. After descriptors are extracted from images, they are expressed as vectors representing visual word content, referred to as mid-level features. In this paper, we review a number of techniques for generating mid-level features, including two variants of Soft Assignment, Locality-constrained Linear Coding, and Sparse Coding. We also isolate the underlying properties that affect their performance. Moreover, we investigate various pooling methods that aggregate mid-level features into vectors representing images. Average pooling, Max-pooling, and a family of likelihood inspired pooling strategies are scrutinised. We demonstrate how both coding schemes and pooling methods interact with each other. We generalise the investigated pooling methods to account for the descriptor interdependence and introduce an intuitive concept of improved pooling. We also propose a coding-related improvement to increase its speed. Lastly, state-of-the-art performance in classification is demonstrated on Caltech101, Flower17, and ImageCLEF11 datasets.