A visual approach for video geocoding using bag-of-scenes

  • Authors:
  • Otávio A. B. Penatti; Lin Tzy Li; Jurandy Almeida; Ricardo da S. Torres

  • Affiliations:
  • University of Campinas, Campinas, SP, Brazil; University of Campinas, Campinas, SP, Brazil and Telecommunications Res. & Dev. Center, Campinas, SP, Brazil; University of Campinas, Campinas, SP, Brazil; University of Campinas, Campinas, SP, Brazil

  • Venue:
  • Proceedings of the 2nd ACM International Conference on Multimedia Retrieval
  • Year:
  • 2012

Abstract

This paper presents a novel approach for video representation, called bag-of-scenes. The proposed method is based on dictionaries of scenes, which provide a high-level representation for videos. Scenes carry much more semantic information than local features, especially for geotagging videos using visual content. Each component of the representation model therefore has self-contained semantics and can be directly related to a specific place of interest. Experiments were conducted in the context of the MediaEval 2011 Placing Task. The reported results compare our strategy with those of other participants that used only visual content for this task. Although the visual dictionary was generated in a very simple way, by selecting photos at random, the results show that our approach achieves high accuracy relative to state-of-the-art solutions.
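
The bag-of-scenes idea can be illustrated with a minimal sketch. The abstract states only that the dictionary is built from photos taken at random and that each dictionary element corresponds to a scene. Everything else below is an assumption made for illustration: the toy 2-D feature vectors, the Euclidean distance, the hard assignment of each frame to its nearest scene, the normalized histogram, and the hypothetical function names `build_scene_dictionary`, `nearest_scene`, and `bag_of_scenes`. It is not the authors' implementation.

```python
import math
import random


def build_scene_dictionary(photo_features, num_scenes, seed=0):
    """Pick `num_scenes` photo feature vectors at random as the scene dictionary."""
    rng = random.Random(seed)
    return rng.sample(photo_features, num_scenes)


def nearest_scene(frame_feature, dictionary):
    """Index of the dictionary scene closest to the frame (Euclidean distance, assumed)."""
    distances = [math.dist(frame_feature, scene) for scene in dictionary]
    return distances.index(min(distances))


def bag_of_scenes(frame_features, dictionary):
    """Hard-assign each frame to its nearest scene; return a normalized histogram."""
    histogram = [0.0] * len(dictionary)
    for frame in frame_features:
        histogram[nearest_scene(frame, dictionary)] += 1.0
    total = sum(histogram)
    return [count / total for count in histogram] if total else histogram


# Toy data: each photo or video frame is a made-up 2-D feature vector.
photos = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [9.0, 9.0]]
dictionary = build_scene_dictionary(photos, num_scenes=2)

video_frames = [[0.1, 0.2], [0.9, 1.1], [8.7, 9.2]]
signature = bag_of_scenes(video_frames, dictionary)
```

In this sketch each bin of `signature` corresponds to one photo in the dictionary. If those photos carry known locations, a bin with a high value points to a candidate place for the video, which is one way to read the abstract's claim that each component has self-contained semantics.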