Towards a more discriminative and semantic visual vocabulary

  • Authors:
  • R. J. López-Sastre; T. Tuytelaars; F. J. Acevedo-Rodríguez; S. Maldonado-Bascón

  • Affiliations:
  • University of Alcalá, Department of Signal Theory and Communications, GRAM, 28805 Alcalá de Henares, Spain; Catholic University of Leuven, ESAT/PSI-VISICS/IBBT, Kasteelpark Arenberg 10, B-3001 Heverlee, Belgium; University of Alcalá, Department of Signal Theory and Communications, GRAM, 28805 Alcalá de Henares, Spain; University of Alcalá, Department of Signal Theory and Communications, GRAM, 28805 Alcalá de Henares, Spain

  • Venue:
  • Computer Vision and Image Understanding
  • Year:
  • 2011

Abstract

We present a novel method for constructing a visual vocabulary that takes into account the class labels of images, thus resulting in better recognition performance and more efficient learning. Our method consists of two stages: Cluster Precision Maximisation (CPM) and Adaptive Refinement. In the first stage, a Reciprocal Nearest Neighbours (RNN) clustering algorithm is guided towards class representative visual words by maximising a new cluster precision criterion. As we are able to optimise the vocabulary without the need for expensive cross-validation, the overall training time is significantly reduced without a negative impact on the results. Next, an adaptive threshold refinement scheme is proposed with the aim of increasing vocabulary compactness while at the same time improving the recognition rate and further increasing the representativeness of the visual words for category-level object recognition. This is a correlation clustering based approach, which works as a meta-clustering and optimises the cut-off threshold for each cluster separately. In the experiments we analyse the recognition rate of different vocabularies for a subset of the Caltech 101 dataset, showing how RNN in combination with CPM selects the optimal codebooks, and how the clustering refinement step succeeds in further increasing the recognition rate.
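The first stage described above (choosing a clustering cut-off by maximising a label-aware precision score instead of running cross-validation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not define the cluster precision criterion, so `cluster_precision` below uses a simple majority-label score that ignores singleton clusters, and `leader_cluster` is a plain leader-style clustering standing in for the RNN algorithm. The function names, the toy 2-D "descriptors", and the candidate thresholds are all hypothetical.

```python
from collections import Counter, defaultdict


def leader_cluster(points, threshold):
    """Stand-in clustering (NOT the paper's RNN algorithm): each point joins
    the first cluster whose leader lies within `threshold`, otherwise it
    becomes the leader of a new cluster."""
    leaders, assignments = [], []
    for p in points:
        for idx, q in enumerate(leaders):
            dist = sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
            if dist <= threshold:
                assignments.append(idx)
                break
        else:
            leaders.append(p)
            assignments.append(len(leaders) - 1)
    return assignments


def cluster_precision(assignments, labels):
    """Assumed criterion (not taken from the paper): the fraction of points
    that carry the majority class label of their cluster. Singleton clusters
    contribute nothing, so trivially pure one-point clusters are not rewarded."""
    members = defaultdict(list)
    for cluster_id, label in zip(assignments, labels):
        members[cluster_id].append(label)
    score = 0
    for cluster_labels in members.values():
        if len(cluster_labels) >= 2:
            score += Counter(cluster_labels).most_common(1)[0][1]
    return score / len(labels)


def select_threshold(points, labels, candidates):
    """Return (threshold, score) for the candidate cut-off with the highest
    assumed precision score; no classifier is trained or cross-validated."""
    scored = [(t, cluster_precision(leader_cluster(points, t), labels))
              for t in candidates]
    return max(scored, key=lambda pair: pair[1])


# Toy data: two well-separated classes of 2-D "descriptors" (hypothetical).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = ["A", "A", "A", "B", "B", "B"]
best_threshold, best_score = select_threshold(points, labels, [0.5, 2.0, 50.0])
```

On this toy data the smallest threshold yields only singleton clusters and the largest merges both classes into one cluster, so the intermediate cut-off scores highest. That mirrors the idea of letting class labels, rather than a downstream classifier, pick the vocabulary.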