Experiments with geo-filtering predicates for IR

  • Authors:
  • Jochen L. Leidner

  • Affiliations:
  • Linguit GmbH, Bad Bergzabern, Germany

  • Venue:
  • CLEF'05 Proceedings of the 6th international conference on Cross-Language Evaluation Forum: accessing Multilingual Information Repositories
  • Year:
  • 2005

Abstract

This paper describes a set of experiments for monolingual English retrieval at Geo-CLEF 2005, evaluating a technique for spatial retrieval based on named entity tagging, toponym resolution, and re-ranking by means of geographic filtering. To this end, a series of systematic experiments in the Vector Space paradigm is presented. Plain bag-of-words versus phrasal retrieval and the potential of meronymy query expansion as a recall-enhancing device are investigated, and three alternative geo-spatial filtering techniques based on spatial clipping are compared and evaluated on 25 monolingual English queries. Preliminary results show that always choosing toponym referents with a simple “maximum population” heuristic, used to approximate the salience of a referent, fails to outperform TF*IDF baselines on the Geo-CLEF 2005 dataset when combined with any of the three geo-filtering predicates. Conservative geo-filtering outperforms more aggressive predicates. The evidence further seems to suggest that query expansion with WordNet meronyms is not effective in combination with the method described. A post-hoc analysis indicates that factors responsible for the low performance include sparseness of available population data, gaps in the gazetteer that associates Minimum Bounding Rectangles with geo-terms in the query, and the composition of the Geo-CLEF 2005 dataset itself.
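
As a rough illustration of the two ideas named in the abstract, the following Python sketch resolves an ambiguous toponym with a "maximum population" heuristic and then clips a ranked result list against a Minimum Bounding Rectangle. It is a minimal sketch under stated assumptions, not the paper's implementation: the class and function names (`Referent`, `MBR`, `resolve_max_population`, `geo_filter`), the toy gazetteer entries, and the single "keep a document if any resolved referent lies inside the query rectangle" predicate are all hypothetical. The abstract does not define the three predicates the paper compares.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Referent:
    """One candidate location for a toponym (hypothetical gazetteer record)."""
    name: str
    lat: float
    lon: float
    population: Optional[int]  # None when the gazetteer has no population figure


@dataclass(frozen=True)
class MBR:
    """Minimum Bounding Rectangle in decimal degrees (no antimeridian handling)."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat: float, lon: float) -> bool:
        return self.south <= lat <= self.north and self.west <= lon <= self.east


def resolve_max_population(candidates: List[Referent]) -> Optional[Referent]:
    """Pick the candidate with the largest known population.

    Returns None when no candidate has population data, which mirrors the
    data-sparseness problem mentioned in the abstract.
    """
    known = [c for c in candidates if c.population is not None]
    if not known:
        return None
    return max(known, key=lambda c: c.population)


def geo_filter(
    ranked: List[Tuple[str, float, List[Referent]]], query_mbr: MBR
) -> List[Tuple[str, float]]:
    """Assumed predicate: keep a document if any resolved referent lies in the MBR.

    The input order (for example a TF*IDF ranking) is preserved.
    """
    return [
        (doc_id, score)
        for doc_id, score, refs in ranked
        if any(query_mbr.contains(r.lat, r.lon) for r in refs)
    ]


# Toy data, invented for illustration only.
a = Referent("Springfield (a)", 10.0, 10.0, 5_000)
b = Referent("Springfield (b)", 50.0, 8.0, 150_000)
c = Referent("Springfield (c)", -20.0, 30.0, None)

chosen = resolve_max_population([a, b, c])
query_mbr = MBR(south=45.0, west=5.0, north=55.0, east=15.0)
ranked = [("d1", 0.9, [a]), ("d2", 0.7, [b]), ("d3", 0.5, [])]
filtered = geo_filter(ranked, query_mbr)
```

In this toy run, `chosen` is the most populous candidate and only the document whose referent falls inside the query rectangle survives. A document with no resolved referents (`d3`) is dropped by this particular predicate; a more conservative variant could keep such documents instead.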