Modeling locations with social media

  • Authors:
  • Neil O'Hare; Vanessa Murdock

  • Affiliations:
  • Yahoo! Research, Barcelona, Spain; Yahoo! Research, Barcelona, Spain

  • Venue:
  • Information Retrieval
  • Year:
  • 2013

Abstract

In this paper we focus on the locations explicit and implicit in users' descriptions of their surroundings. We propose a statistical language modeling approach to identifying locations in arbitrary text, and investigate several ways to estimate the models, based on the term frequency and the user frequency. The geotagged public photos in Flickr serve as a convenient ground truth. Our results show that we can predict location within a one kilometer by one kilometer cell with 17% accuracy, and within a three kilometer radius around such a one kilometer cell with 40% accuracy, using only a photo's tags. This is significantly better than the state of the art. Further, we examine several estimation strategies that leverage the physical proximity of places, and show that for sparsely represented locations, smoothing from the immediate neighborhood improves results. We also show that estimation strategies based on user frequency are much more reliable than approaches based on the raw term frequency.
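To make the ideas in the abstract concrete, the following is a minimal sketch of a per-cell tag language model estimated from user frequency and interpolated with counts from the eight adjacent cells. It is an illustration under stated assumptions, not the authors' implementation: the grid in degrees (`size_deg`), the add-alpha smoothing, the mixing weight `mu`, and all function names are hypothetical choices made for this example, and the sample photos are invented toy data.

```python
import math
from collections import defaultdict

def cell_of(lat, lon, size_deg=0.01):
    """Map a coordinate to a grid cell (assumption: fixed-size cells in degrees)."""
    return (math.floor(lat / size_deg), math.floor(lon / size_deg))

def train(photos):
    """photos: iterable of (user_id, lat, lon, tags).
    Stores, per cell and tag, the set of distinct users who used the tag there."""
    model = defaultdict(lambda: defaultdict(set))
    for user, lat, lon, tags in photos:
        cell = cell_of(lat, lon)
        for tag in set(tags):
            model[cell][tag].add(user)
    return model

def user_counts(model, cell):
    """User frequency: number of distinct users per tag in a cell."""
    return {tag: len(users) for tag, users in model.get(cell, {}).items()}

def neighbours(cell):
    """The eight cells immediately surrounding a cell."""
    r, c = cell
    return [(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)]

def score(model, vocab_size, cell, tags, mu=0.5, alpha=0.1):
    """Log-likelihood of the tags under the cell's model, linearly
    interpolated with a model built from the neighbouring cells.
    mu and alpha are illustrative values, not tuned parameters."""
    own = user_counts(model, cell)
    nb = defaultdict(int)
    for n in neighbours(cell):
        for tag, k in user_counts(model, n).items():
            nb[tag] += k
    own_total = sum(own.values())
    nb_total = sum(nb.values())
    logp = 0.0
    for tag in tags:
        p_own = (own.get(tag, 0) + alpha) / (own_total + alpha * vocab_size)
        p_nb = (nb.get(tag, 0) + alpha) / (nb_total + alpha * vocab_size)
        logp += math.log((1 - mu) * p_own + mu * p_nb)
    return logp

def predict(model, vocab_size, tags):
    """Return the known cell whose smoothed model best explains the tags."""
    return max(model, key=lambda cell: score(model, vocab_size, cell, tags))

# Toy data (invented for illustration only).
photos = [
    ("u1", 41.3851, 2.1734, ["barcelona", "ramblas"]),
    ("u2", 41.3853, 2.1736, ["barcelona", "ramblas"]),
    ("u3", 48.8584, 2.2945, ["paris", "eiffel"]),
    ("u3", 48.8584, 2.2945, ["paris", "eiffel"]),
]
model = train(photos)
vocab_size = len({t for _, _, _, tags in photos for t in tags})
print(predict(model, vocab_size, ["ramblas"]))
```

In this sketch, the two photos uploaded by `u3` contribute a count of one per tag rather than two, which is the sense in which user frequency dampens the effect of a single prolific uploader compared with raw term frequency.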