Matching sets of features for efficient retrieval and recognition

Authors:
Trevor Darrell;Kristen Lorraine Grauman
Affiliations:
Massachusetts Institute of Technology;Massachusetts Institute of Technology
Venue:
Matching sets of features for efficient retrieval and recognition
Year:
2006

Citing 0
Cited 7

The Pyramid Match Kernel: Efficient Learning with Sets of Features
The Journal of Machine Learning Research

The pyramid match: efficient learning with partial correspondences
AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 2

Local feature hashing for face recognition
BTAS'09 Proceedings of the 3rd IEEE international conference on Biometrics: Theory, applications and systems

An improved pyramid matching kernel
LSMS/ICSEE'10 Proceedings of the 2010 international conference on Life system modeling and intelligent computing, and 2010 international conference on Intelligent computing for sustainable energy and environment: Part I

Large-scale EMM identification based on geometry-constrained visual word correspondence voting
Proceedings of the 1st ACM International Conference on Multimedia Retrieval

Local histograms of character N-grams for authorship attribution
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1

Spatial pyramid formulation in weakly supervised manner
ISNN'13 Proceedings of the 10th international conference on Advances in Neural Networks - Volume Part II

Abstract

In numerous domains it is useful to represent a single example by the collection of local features or parts that comprise it. In computer vision in particular, local image features are a powerful way to describe images of objects and scenes. Their stability under variable image conditions is critical for success in a wide range of recognition and retrieval applications. However, many conventional similarity measures and machine learning algorithms assume vector inputs. Comparing and learning from images represented by sets of local features is therefore challenging, since each set may vary in cardinality and its elements lack a meaningful ordering. In this thesis I present computationally efficient techniques to handle comparisons, learning, and indexing with examples represented by sets of features. The primary goal of this research is to design and demonstrate algorithms that can effectively accommodate this useful representation in a way that scales with both the representation size and the number of images available for indexing or learning. I introduce the pyramid match algorithm, which efficiently forms an implicit partial matching between two sets of feature vectors. The matching has a linear time complexity, naturally forms a Mercer kernel, and is robust to clutter or outlier features, a critical advantage for handling images with variable backgrounds, occlusions, and viewpoint changes. I provide bounds on the expected error relative to the optimal partial matching. For very large databases, even extremely efficient pairwise comparisons may not offer adequately responsive query times. I show how to perform sub-linear time retrievals under the matching measure with randomized hashing techniques, even when input sets have varying numbers of features.
My results are focused on several important vision tasks, including applications to content-based image retrieval, discriminative classification for object recognition, kernel regression, and unsupervised learning of categories. I show how the dramatic increase in performance enables accurate and flexible image comparisons to be made on large-scale data sets, and removes the need to artificially limit the number of local descriptions used per image when learning visual categories.
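
The implicit partial matching described in the abstract can be illustrated with a small sketch. It is not the thesis implementation. It assumes the formulation from the published pyramid match kernel work: features are binned into grids whose cell side doubles at each level, matches are counted by histogram intersection, and matches first formed at level i are weighted by 1/2^i. The function names, the weights, and the unshifted grid are illustrative assumptions.

```python
from collections import Counter

def histogram(points, side):
    """Count points per grid cell for cells of the given side length."""
    return Counter(tuple(int(c // side) for c in p) for p in points)

def intersection(h1, h2):
    """Histogram intersection: sum of per-bin minimum counts."""
    return sum(min(n, h2[b]) for b, n in h1.items())

def pyramid_match(xs, ys, levels):
    """Weighted count of the matches newly formed at each pyramid level.

    Assumption: cell side 2**i at level i, weight 1 / 2**i per new match.
    """
    score, prev = 0.0, 0
    for i in range(levels):
        side = 2 ** i
        cur = intersection(histogram(xs, side), histogram(ys, side))
        score += (cur - prev) / side  # only matches not seen at finer levels
        prev = cur
    return score

# Two sets of different cardinality; (100, 100) acts as an outlier.
xs = [(0, 0), (5, 5)]
ys = [(0, 0), (6, 6), (100, 100)]
print(pyramid_match(xs, ys, levels=4))
```

In this example the identical points match at the finest level, the nearby pair (5, 5) and (6, 6) matches only once the cells have side 4, and the outlier never finds a partner, so it simply contributes nothing to the score. Each level is a single pass over the features, which is where the linear cost in set size comes from.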