Assigning geographical scopes to web pages

  • Authors:
  • Bruno Martins;Marcirio Chaves;Mário J. Silva

  • Affiliations:
  • Departamento de Informática da, Faculdade de Ciências da, Universidade de Lisboa, Lisboa, Portugal;Departamento de Informática da, Faculdade de Ciências da, Universidade de Lisboa, Lisboa, Portugal;Departamento de Informática da, Faculdade de Ciências da, Universidade de Lisboa, Lisboa, Portugal

  • Venue:
  • ECIR'05 Proceedings of the 27th European conference on Advances in Information Retrieval Research
  • Year:
  • 2005

Abstract

Finding automatic ways of attaching geographical scopes to on-line resources, also called “geo-referencing” documents, is a challenging problem that is receiving increasing attention [1,5,3]. Here we present a system architecture and a process for identifying the geographical scope of Web pages, where a scope is defined as the region in which more people than average would find the page relevant. We rely on typical Web IR heuristics (i.e., feature weighting, hypertext topic locality, and anchor description) and on assumptions about how people use geographical references in documents. The method involves three major steps. First, geographical named entities are identified in the text. Next, the named entities found are propagated through the Web linkage graph. Finally, a geographical ontology is used to disambiguate among the named entities associated with a document, thereby selecting the most likely scope. In the future, we plan to use scopes in new location-aware search tools.
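
The three steps can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the recognizer, the propagation weights, or the disambiguation rule, so everything below is an assumption made for illustration. The `PARENT` ontology, the naive token matching in `extract_places`, the `damping` factor in `propagate`, and the majority rule in `select_scope` (pick the most specific region whose subtree holds more than half of the evidence) are all hypothetical choices.

```python
from collections import Counter, defaultdict

# Hypothetical toy ontology: each place maps to its parent region (None = root).
PARENT = {
    "Portugal": None,
    "Lisboa": "Portugal",
    "Porto": "Portugal",
    "Sintra": "Lisboa",
}


def extract_places(text):
    """Step 1 (assumed): count ontology names found by naive token matching."""
    tokens = [t.strip(".,;:!?()") for t in text.split()]
    return Counter(t for t in tokens if t in PARENT)


def propagate(page_counts, links, damping=0.5):
    """Step 2 (assumed): a page passes a damped copy of its own counts
    to every page it links to (a single hop, no iteration)."""
    scores = {page: defaultdict(float, counts) for page, counts in page_counts.items()}
    for source, targets in links.items():
        for target in targets:
            for place, count in page_counts[source].items():
                scores[target][place] += damping * count
    return scores


def ancestors(place):
    """Return the place followed by all of its ancestors in the ontology."""
    chain = []
    while place is not None:
        chain.append(place)
        place = PARENT[place]
    return chain


def select_scope(place_scores, threshold=0.5):
    """Step 3 (assumed): credit each score to the place and its ancestors,
    then return the deepest region holding more than `threshold` of the total."""
    total = sum(place_scores.values())
    if total == 0:
        return None
    subtree = defaultdict(float)
    for place, score in place_scores.items():
        for node in ancestors(place):
            subtree[node] += score
    candidates = [n for n, s in subtree.items() if s / total > threshold]
    if not candidates:
        return None
    return max(candidates, key=lambda n: len(ancestors(n)))


# Made-up pages and links, for illustration only.
pages = {
    "a": "Concerts in Sintra and Lisboa this weekend.",
    "b": "Weather for Porto: Porto stays sunny.",
    "c": "Links to local guides.",
}
links = {"a": ["c"], "b": ["c"]}

counts = {page: extract_places(text) for page, text in pages.items()}
scores = propagate(counts, links)
scopes = {page: select_scope(s) for page, s in scores.items()}
print(scopes)
```

With this made-up data, page `c` contains no place names of its own and receives its evidence only from the pages linking to it. Because that evidence is split between Lisboa and Porto, the majority rule falls back to their common ancestor, Portugal, while pages `a` and `b` resolve to Lisboa and Porto respectively.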