Multi-modal, multi-resource methods for placing Flickr videos on the map

  • Authors:
  • Pascal Kelm; Sebastian Schmiedeke; Thomas Sikora

  • Affiliations:
  • Technische Universität Berlin, Germany; Technische Universität Berlin, Germany; Technische Universität Berlin, Germany

  • Venue:
  • Proceedings of the 1st ACM International Conference on Multimedia Retrieval
  • Year:
  • 2011

Abstract

We present three approaches for placing Flickr videos on the world map. The toponym extraction and geo-lookup approach uses external resources to identify toponyms in the metadata and associate them with geo-coordinates. The metadata-based region model approach uses a k-nearest-neighbour classifier trained over geographical regions; videos are represented by their metadata in a text space of reduced dimensionality. The visual region model approach uses a support vector machine, also trained over geographical regions; videos are represented by low-level feature vectors extracted from multiple key frames, and voting methods combine the per-frame results into a single decision for each video. We compare the approaches experimentally, highlighting the importance of using appropriate metadata features and suitable regions as the basis of the region model. The best performance is achieved by the geo-lookup approach with fallback to the visual region model when the video metadata contains no toponym.
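
The best-performing combination described in the abstract (geo-lookup first, visual region model as fallback) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The gazetteer, the region names, the region centroids, and the function names are all hypothetical. The abstract does not say which voting method is used or how a region becomes a coordinate, so simple majority voting over per-key-frame region labels and a region-centroid lookup are assumed here.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple

Coordinate = Tuple[float, float]  # (latitude, longitude)

# Hypothetical toy gazetteer standing in for the external geo resources.
GAZETTEER = {
    "berlin": (52.52, 13.405),
    "paris": (48.8566, 2.3522),
}

# Hypothetical regions with assumed representative coordinates.
REGION_CENTROIDS = {
    "europe_central": (50.0, 10.0),
    "north_america_east": (40.0, -75.0),
}


def geo_lookup(tags: Sequence[str]) -> Optional[Coordinate]:
    """Return the coordinate of the first tag found in the gazetteer, if any."""
    for tag in tags:
        coordinate = GAZETTEER.get(tag.lower())
        if coordinate is not None:
            return coordinate
    return None


def vote_region(frame_predictions: Sequence[str]) -> str:
    """Majority vote over per-key-frame region labels (assumed voting rule)."""
    return Counter(frame_predictions).most_common(1)[0][0]


def place_video(
    tags: Sequence[str], frame_predictions: Sequence[str]
) -> Optional[Coordinate]:
    """Use geo-lookup when a toponym is found, else fall back to the visual vote."""
    coordinate = geo_lookup(tags)
    if coordinate is not None:
        return coordinate
    if not frame_predictions:
        return None  # neither metadata nor visual evidence is available
    return REGION_CENTROIDS[vote_region(frame_predictions)]
```

In this sketch, `frame_predictions` stands in for the output of a per-frame visual classifier (a support vector machine in the paper). A call such as `place_video(["holiday"], ["europe_central", "north_america_east", "europe_central"])` finds no toponym and therefore returns the centroid of the region that wins the vote.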