Location disambiguation in local searches using gradient boosted decision trees

  • Authors:
  • Ritesh J. Agrawal; James G. Shanahan

  • Affiliations:
  • AT&T Interactive, San Francisco, CA; AT&T Interactive, San Francisco, CA

  • Venue:
  • Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems
  • Year:
  • 2010

Abstract

Local search is a specialization of web search that allows users to submit geographically constrained queries. One of the challenges for local search engines is to uniquely understand and locate the geographical intent of a query. Geographical constraints (or location references) in a local search are often incomplete and therefore suffer from the referent ambiguity problem, in which the same location name can refer to several different places. For instance, the term "Springfield" by itself can refer to 30 different cities in the USA. Previous approaches to location disambiguation have generally been hand-compiled heuristic models. In this paper, we examine a data-driven, machine learning approach to location disambiguation. Specifically, we separately train Gradient Boosted Decision Tree (GBDT) models on thousands of desktop-based and mobile-based local searches and compare their performance to that of one of our previous heuristic-based location disambiguation systems (HLDS). The GBDT-based approach shows promising results, with statistically significant improvements over the HLDS approach. The error rate reduction is about 9% for desktop-based and about 22% for mobile-based local searches. Additionally, we examine the relative influence of various geographic and non-geographic features that help with the location disambiguation task. Notably, while the distance between the user and the intended location has been considered an important variable, the relative influence of distance is secondary to that of the popularity of the location in the learned GBDT models.
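
The abstract gives no implementation details, so the following is only a minimal sketch of the general technique it names: gradient boosting over decision trees, followed by a per-feature relative influence score. It assumes squared-error loss, depth-1 trees (stumps), and two hypothetical features, `popularity` and `distance_km`. The toy rows are invented for illustration and do not reproduce the paper's data, features, or results. Relative influence is computed here as each feature's share of the total squared-error reduction across all splits, which is one common way to score features in boosted trees.

```python
# Minimal gradient boosting with depth-1 regression trees (stumps), squared loss.
# ASSUMPTION: feature names and toy data are hypothetical, not from the paper.

def fit_stump(X, residuals):
    """Return the single (feature, threshold) split that most reduces squared error."""
    mean = sum(residuals) / len(residuals)
    base_sse = sum((r - mean) ** 2 for r in residuals)
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint, so both sides are non-empty
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            gain = base_sse - sse
            if best is None or gain > best["gain"]:
                best = {"feature": j, "threshold": t,
                        "left": lm, "right": rm, "gain": gain}
    return best

def predict_stump(stump, row):
    side = "left" if row[stump["feature"]] <= stump["threshold"] else "right"
    return stump[side]

def fit_gbdt(X, y, n_trees=20, learning_rate=0.3):
    """Each stump is fit to the residuals left by the previous ones."""
    init = sum(y) / len(y)
    preds = [init] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return init, stumps, learning_rate

def predict(model, row):
    init, stumps, lr = model
    return init + lr * sum(predict_stump(s, row) for s in stumps)

def relative_influence(model, n_features):
    """Share of total squared-error reduction attributed to each feature."""
    _, stumps, _ = model
    totals = [0.0] * n_features
    for s in stumps:
        totals[s["feature"]] += s["gain"]
    total = sum(totals)
    return [t / total for t in totals] if total > 0 else totals

# Hypothetical candidates: [popularity in 0..1, distance from user in km].
# Label 1 means the candidate was the location the user intended.
FEATURES = ["popularity", "distance_km"]
X = [[0.9, 120.0], [0.2, 15.0], [0.8, 300.0], [0.1, 40.0],
     [0.7, 10.0], [0.3, 5.0], [0.4, 500.0], [0.6, 800.0]]
y = [1, 0, 1, 0, 1, 1, 0, 0]

model = fit_gbdt(X, y)
influence = relative_influence(model, len(FEATURES))
for name, share in zip(FEATURES, influence):
    print(f"{name}: {share:.2f}")
```

In practice a library implementation with deeper trees and a classification or ranking loss would likely be used; the sketch only shows how boosting on residuals and split-gain accounting fit together.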