Computing geographical serving area based on search logs and website categorization

  • Authors:
  • Qi Zhang;Xing Xie;Lee Wang;Lihua Yue;Wei-Ying Ma

  • Affiliations:
  • Department of CS, University of Science and Technology of China, Hefei, P.R. China;Microsoft Research Asia, Beijing, China;Microsoft Research Asia, Beijing, China;Department of CS, University of Science and Technology of China, Hefei, P.R. China;Microsoft Research Asia, Beijing, China

  • Venue:
  • DEXA'07 Proceedings of the 18th international conference on Database and Expert Systems Applications
  • Year:
  • 2007

Abstract

Knowing the geographical serving area of web resources is important for many web applications. Here, the serving area is the geographical distribution of online users who are interested in a given web site. In this paper, we propose a set of novel methods to detect the serving area of web resources by analyzing search engine logs. We use the search logs in two ways. First, we extract user IP locations to generate the geographical distribution of users who share an interest in a web site. Second, we treat the query terms entered by users as user knowledge about a web site. To increase confidence and to cover new sites for use in real-time applications, we also propose a categorization system for local web sites, together with a novel method that detects the serving area by categorizing web content: each category is assigned a radius derived from previous logs. In our experiments, we evaluate all three algorithms. The results show that the approach based on query terms is superior to the one based on IP locations, because search queries for local sites tend to include location words, while IP locations are sometimes erroneous. The approach based on categorization is efficient for sites of known categories and is useful for small sites without a sufficient number of query logs.
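The abstract gives no implementation details, so the following Python sketch only illustrates two of the ideas it mentions: building a per-site geographical distribution from the IP-derived locations of users, and assigning each site category a radius from earlier logs. The function names, the input record layouts, and the choice of the median user-to-site distance as the radius are all assumptions made for this illustration, not the authors' method.

```python
from collections import Counter, defaultdict
from math import asin, cos, radians, sin, sqrt
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used by the haversine formula


def serving_distribution(clicks):
    """Map each site to the share of its log entries per user region.

    `clicks` is a hypothetical iterable of (site, region) pairs, where the
    region is assumed to have been resolved from the user's IP address.
    """
    counts = defaultdict(Counter)
    for site, region in clicks:
        counts[site][region] += 1
    return {
        site: {region: n / sum(c.values()) for region, n in c.items()}
        for site, c in counts.items()
    }


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(h)))


def category_radius(samples):
    """Assign each category a radius: the median user-to-site distance.

    `samples` is a hypothetical iterable of (category, site_coord, user_coord)
    tuples taken from earlier logs. Using the median is an assumption of this
    sketch; the abstract does not say how the radius is derived.
    """
    distances = defaultdict(list)
    for category, site_coord, user_coord in samples:
        distances[category].append(haversine_km(site_coord, user_coord))
    return {category: median(d) for category, d in distances.items()}
```

A site in a known category could then be given that category's radius around its own location even when it has too few log entries for the distribution to be meaningful, which is the situation the abstract describes for small sites.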