Local search is a specialization of web search that lets users submit geographically constrained queries. One challenge for local search engines is to understand and locate the geographical intent of a query unambiguously. Geographical constraints (or location references) in a local search are often incomplete and therefore suffer from the referent ambiguity problem, in which the same location name can refer to several different places. For instance, the term "Springfield" by itself can refer to 30 different cities in the USA. Previous approaches to location disambiguation have generally been hand-compiled heuristic models. In this paper, we examine a data-driven, machine learning approach to location disambiguation. We train separate Gradient Boosted Decision Tree (GBDT) models on thousands of desktop and mobile local searches and compare their performance to that of one of our previous heuristic-based location disambiguation systems (HLDS). The GBDT-based approach shows promising results, with statistically significant improvements over the HLDS approach: the error rate reduction is about 9% for desktop-based and 22% for mobile-based local searches. Additionally, we examine the relative influence of various geographic and non-geographic features that help with the location disambiguation task. Notably, although the distance between the user and the intended location has been considered an important variable, its relative influence is secondary to that of the location's popularity in the learned GBDT models.
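As a rough illustration of the idea of boosting small decision trees and then reading off per-feature influence, the following is a minimal sketch. It is not the system described above. Everything in it is an assumption made for illustration: the two features (a normalised popularity score and the user-to-candidate distance in km), the hand-made training rows, the candidate values for "Springfield", the squared-error loss on 0/1 labels, and the use of depth-1 trees (stumps). The toy data is deliberately built so that popularity separates the labels, so the resulting influence values demonstrate the mechanics only and do not reproduce any reported result.

```python
"""Toy gradient boosting with depth-1 regression trees (stumps), standard library only."""


def fit_stump(rows, residuals):
    """Return (gain, feature, threshold, left_value, right_value) for the best split, or None."""
    mean = sum(residuals) / len(residuals)
    total_sse = sum((r - mean) ** 2 for r in residuals)
    best = None
    for f in range(len(rows[0])):
        values = sorted(set(row[f] for row in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(rows, residuals) if row[f] <= thr]
            right = [r for row, r in zip(rows, residuals) if row[f] > thr]
            lv = sum(left) / len(left)
            rv = sum(right) / len(right)
            sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            gain = total_sse - sse  # reduction in squared error from this split
            if best is None or gain > best[0]:
                best = (gain, f, thr, lv, rv)
    return best


def fit_gbdt(rows, labels, n_rounds=20, learning_rate=0.3):
    """Fit stumps to residuals round by round; accumulate split gain per feature."""
    base = sum(labels) / len(labels)
    preds = [base] * len(rows)
    stumps = []
    influence = [0.0] * len(rows[0])
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(labels, preds)]
        stump = fit_stump(rows, residuals)
        if stump is None:
            break
        gain, f, thr, lv, rv = stump
        influence[f] += gain
        stumps.append((f, thr, lv, rv))
        preds = [p + learning_rate * (lv if row[f] <= thr else rv)
                 for p, row in zip(preds, rows)]
    return {"base": base, "lr": learning_rate, "stumps": stumps}, influence


def score(model, row):
    """Higher score means the candidate is more likely the intended location."""
    total = model["base"]
    for f, thr, lv, rv in model["stumps"]:
        total += model["lr"] * (lv if row[f] <= thr else rv)
    return total


# Hypothetical training rows: [popularity in 0..1, distance from user in km].
# Label 1 marks the candidate the user intended, 0 a rejected candidate.
rows = [[0.9, 800.0], [0.2, 30.0], [0.8, 1200.0], [0.3, 50.0],
        [0.7, 400.0], [0.1, 900.0], [0.6, 20.0], [0.4, 600.0]]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

model, influence = fit_gbdt(rows, labels)

# Hypothetical candidates for the ambiguous query "Springfield".
candidates = [("Springfield, IL", [0.75, 700.0]), ("Springfield, OR", [0.35, 40.0])]
best = max(candidates, key=lambda c: score(model, c[1]))
print("influence [popularity, distance]:", influence)
print("chosen candidate:", best[0])
```

A production model would more likely use a logistic or ranking loss, deeper trees, and many more features, with separate models for desktop and mobile traffic; the accumulated split gain per feature used here is one common way to estimate relative influence in boosted trees.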