Evaluation of active learning strategies for video indexing

  • Authors:
  • Stéphane Ayache; Georges Quénot

  • Affiliations:
  • Laboratoire d'Informatique de Grenoble, BP 53, Grenoble Cedex 9, France; Laboratoire d'Informatique de Grenoble, BP 53, Grenoble Cedex 9, France

  • Venue:
  • Image Communication
  • Year:
  • 2007

Abstract

In this paper, we compare active learning strategies for indexing concepts in video shots. Active learning is simulated using subsets of a fully annotated data set instead of actually calling for user intervention. Training is done using the collaborative annotation of 39 concepts from the TRECVID 2005 campaign. Performance is measured on the 20 concepts selected for the TRECVID 2006 concept detection task. The simulation allows us to explore the effect of several parameters: the strategy, the annotated fraction of the data set, the size of the data set, the number of iterations, and the relative difficulty of the concepts. Three strategies were compared. The first two select, respectively, the most probable and the most uncertain samples; the third is a random choice. For easy concepts, the "most probable" strategy is the best one when less than 15% of the data set is annotated, and the "most uncertain" strategy is the best one when 15% or more of the data set is annotated. The "most probable" and "most uncertain" strategies are roughly equivalent for moderately difficult and difficult concepts. In all cases, the maximum performance is reached when 12-15% of the whole data set is annotated. This result is, however, dependent upon the step size and the training set size. One-fortieth of the training set size is a good value for the step size. The size of the subset of the training set that has to be annotated in order to reach the maximum achievable performance varies with the square root of the training set size. The "most probable" strategy is more "recall oriented" and the "most uncertain" strategy is more "precision oriented".
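The three selection strategies can be illustrated with a minimal Python sketch. This is not the authors' implementation: the logistic-regression classifier (standing in for the paper's concept detector), the feature matrix X, the label vector y, and all function names are assumptions made for illustration. It does follow the abstract's simulation setup: ground-truth labels are revealed instead of querying a user, and the step size is one-fortieth of the training set size.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def select_batch(scores, unlabeled_idx, step, strategy, rng):
    """Pick `step` unlabeled samples; `scores` are P(concept | shot) in [0, 1]."""
    if strategy == "most_probable":        # highest estimated relevance first
        order = np.argsort(-scores)
    elif strategy == "most_uncertain":     # closest to the decision boundary (p ~ 0.5)
        order = np.argsort(np.abs(scores - 0.5))
    elif strategy == "random":
        order = rng.permutation(len(scores))
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return unlabeled_idx[order[:step]]

def simulate_active_learning(X, y, strategy, n_iterations=20, seed=0):
    """Simulate active learning on a fully annotated set: labels come from y, not a user."""
    rng = np.random.default_rng(seed)
    step = max(1, len(y) // 40)  # step size = one-fortieth of the training set size
    # Seed the labeled pool with one positive and one negative example.
    labeled = np.array([rng.choice(np.flatnonzero(y == 1)),
                        rng.choice(np.flatnonzero(y == 0))])
    for _ in range(n_iterations):
        unlabeled = np.setdiff1d(np.arange(len(y)), labeled)
        if len(unlabeled) == 0:
            break
        clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
        scores = clf.predict_proba(X[unlabeled])[:, 1]
        picked = select_batch(scores, unlabeled, step, strategy, rng)
        labeled = np.concatenate([labeled, picked])  # "annotate" the selected shots
    return labeled  # indices of the annotated fraction of the data set
```

In this sketch, the "most probable" rule tends to pull in many positive shots early (recall oriented), while the "most uncertain" rule concentrates annotation effort near the decision boundary (precision oriented), matching the behaviour reported in the abstract.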