The state of the art in large-scale object retrieval from image databases is based on quantizing descriptors of interest points into visual words. High similarity between matching image representations (as bags of words) rests on the assumption that matched points in the two images are assigned to similar words (hard assignment) or to similar distributions over words (soft assignment). In this paper we study how ground-truth correspondences can be used to generate better visual vocabularies. Matched image patches can be obtained, e.g., from deformable models or by estimating 3D geometry. To optimize the vocabulary, we propose minimizing the entropy of the soft assignment of points, with clustering based on hierarchical k-splits. The resulting entropy-based clustering is compared with hierarchical k-means. On real data, the learned vocabularies show decreased entropy and an increased true-positive rate, as well as improved retrieval performance.
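The entropy criterion described above can be illustrated with a minimal sketch: a descriptor that falls close to a single visual word yields a peaked (low-entropy) soft assignment, while a descriptor midway between words yields a flat (high-entropy) one. The Gaussian weighting, the `sigma` parameter, and the function names here are illustrative assumptions, not the paper's exact formulation:

```python
import math

def soft_assign(desc, words, sigma=0.2):
    """Soft-assign a descriptor to visual words with Gaussian weights.

    desc  : descriptor vector (list of floats)
    words : list of cluster centers (visual words), same dimension as desc
    sigma : kernel bandwidth (illustrative choice, not from the paper)
    """
    weights = [
        math.exp(-sum((d - c) ** 2 for d, c in zip(desc, word)) / (2 * sigma ** 2))
        for word in words
    ]
    total = sum(weights)
    return [w / total for w in weights]  # normalized assignment distribution

def assignment_entropy(probs):
    """Shannon entropy (bits) of a soft-assignment distribution.

    Lower entropy means a more peaked, more word-like assignment;
    the vocabulary is optimized to make this small for matched points.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)
```

For example, with two hypothetical words at (0, 0) and (1, 1), a descriptor near the first word gets an entropy close to 0, while a descriptor at the midpoint gets the maximal entropy of 1 bit; a vocabulary that splits cells so matched descriptors avoid such boundaries lowers the average entropy.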