Extracting focused locations for web pages

  • Authors:
  • Qingqing Zhang; Peiquan Jin; Sheng Lin; Lihua Yue

  • Affiliations:
  • University of Science and Technology of China, China (all four authors)

  • Venue:
  • WAIM'11 Proceedings of the 2011 international conference on Web-Age Information Management
  • Year:
  • 2011

Abstract

Most Web pages contain location information, which can be used to improve the effectiveness of search engines. In this paper, we concentrate on focused locations, that is, the locations most appropriately associated with a Web page. Current algorithms suffer from ambiguity among location names: many different locations share the same name (known as GEO/GEO ambiguity), and some locations share a name with non-geographical entities such as persons (known as GEO/NON-GEO ambiguity). We first propose a new algorithm named GeoRank, which employs an idea similar to PageRank to resolve GEO/GEO ambiguity. We also introduce heuristic rules to eliminate GEO/NON-GEO ambiguity. We then present an algorithm with dynamic parameters for determining the focused locations. We conduct experiments on two real datasets to evaluate the performance of our approach. The experimental results show that our algorithm outperforms state-of-the-art methods in both disambiguation and focused-location determination.
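
The abstract does not give GeoRank's details, so the following is only a minimal sketch of what a PageRank-style approach to GEO/GEO disambiguation could look like, not the authors' algorithm. It assumes that each place name on a page has several candidate interpretations and that candidates of different names "vote" for each other in proportion to a relatedness weight. The gazetteer entries, the `relatedness` weights, the damping factor, and the function names are all hypothetical choices made for illustration.

```python
# Hypothetical gazetteer: place name -> candidate interpretations, each a
# (place, country) pair. Entries are illustrative, not taken from the paper.
GAZETTEER = {
    "Paris": [("Paris", "France"), ("Paris", "USA")],
    "Lyon": [("Lyon", "France")],
    "Marseille": [("Marseille", "France")],
}


def relatedness(a, b):
    """Assumed vote weight: strong if two candidates share a country."""
    return 1.0 if a[1] == b[1] else 0.1


def rank_candidates(names, damping=0.85, iterations=50):
    """Propagate scores among candidate interpretations, PageRank-style."""
    names = list(dict.fromkeys(names))  # drop repeated names, keep order
    candidates = [(n, c) for n in names for c in GAZETTEER[n]]
    score = {cand: 1.0 / len(candidates) for cand in candidates}
    for _ in range(iterations):
        new_score = {}
        for name, cand in candidates:
            incoming = 0.0
            for voter_name, voter in candidates:
                if voter_name == name:
                    continue  # candidates of the same name do not vote for each other
                # The voter splits its score over candidates of all other names.
                total = sum(
                    relatedness(voter, c) for n, c in candidates if n != voter_name
                )
                incoming += score[(voter_name, voter)] * relatedness(voter, cand) / total
            new_score[(name, cand)] = (1 - damping) / len(candidates) + damping * incoming
        score = new_score
    return score


def resolve(names):
    """Pick the highest-scoring interpretation for each place name."""
    score = rank_candidates(names)
    best = {}
    for (name, cand), value in score.items():
        if name not in best or value > score[(name, best[name])]:
            best[name] = cand
    return best


if __name__ == "__main__":
    print(resolve(["Paris", "Lyon", "Marseille"]))
```

In this toy setting, a page mentioning Paris together with Lyon and Marseille resolves "Paris" to the French candidate, because the unambiguous French cities pass most of their score to it. A real system would need a full gazetteer and a relatedness measure grounded in the geographic hierarchy or in the page text.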