On metonymy recognition for geographic information retrieval

  • Authors:
  • Johannes Leveling; Sven Hartrumpf

  • Affiliations:
  • Intelligent Information and Communication Systems (IICS), FernUniversität in Hagen (University of Hagen), 58084 Hagen, Germany; Intelligent Information and Communication Systems (IICS), FernUniversität in Hagen (University of Hagen), 58084 Hagen, Germany

  • Venue:
  • International Journal of Geographical Information Science
  • Year:
  • 2008

Abstract

Metonymically used location names (toponyms) refer to other, related entities and thus possess a meaning different from their literal, geographic sense. Metonymic uses should therefore be treated differently from literal ones to improve the performance of geographic information retrieval (GIR). Statistics on toponym senses show that 75.06% of all location names are used in their literal sense, 17.05% are used metonymically, and 7.89% have a mixed sense. This article presents a method, based on shallow features, for disambiguating location names in texts between literal and metonymic senses. The evaluation of this method is two-fold. First, we use a memory-based learner (TiMBL) to train a classifier and determine standard evaluation measures such as F-score and accuracy. The classifier achieved an F-score of 0.842 and an accuracy of 0.846 for identifying toponym senses in a subset of the CoNLL (Conference on Natural Language Learning) data. Second, we perform retrieval experiments based on the GeoCLEF data (newspaper article corpus and queries) from 2005 and 2006. We compare searching for location names in a database index containing both their literal and metonymic senses with searching in an index containing their literal senses only. Evaluation results indicate that removing metonymic senses from the index yields a higher mean average precision (MAP) for GIR. We observed a significant gain in MAP: from 0.0704 to 0.0715 for the GeoCLEF 2005 data, and from 0.1944 to 0.2100 for the GeoCLEF 2006 data.
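
The retrieval comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the function names, the toy occurrences, and the document identifiers below are all hypothetical, and the sense labels are assumed to come from some upstream classifier. The sketch builds one toponym index from all occurrences and one from literal occurrences only, then scores each with the standard definitions of average precision and MAP. How mixed senses should be handled is not stated in the abstract; this sketch simply drops everything not labelled "literal" in the literal-only index.

```python
from collections import defaultdict


def build_index(occurrences, literal_only=False):
    """Map each toponym to the set of documents that mention it.

    occurrences: iterable of (doc_id, toponym, sense) tuples, where sense is
    a label such as "literal", "metonymic", or "mixed" (assumed to be
    produced by an upstream classifier).
    """
    index = defaultdict(set)
    for doc_id, toponym, sense in occurrences:
        # Assumption: anything not labelled "literal" is left out of the
        # literal-only index.
        if literal_only and sense != "literal":
            continue
        index[toponym].add(doc_id)
    return index


def average_precision(ranked, relevant):
    """Average precision of one ranked result list against a relevant set."""
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Mean of average precision over (ranked, relevant) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical toy data: d2 mentions "Berlin" metonymically (for example,
# standing for the government) and is not relevant to a geographic query.
occurrences = [
    ("d1", "Berlin", "literal"),
    ("d2", "Berlin", "metonymic"),
    ("d3", "Berlin", "literal"),
]
relevant = {"d1", "d3"}

full_index = build_index(occurrences)
literal_index = build_index(occurrences, literal_only=True)

# Sorting by doc_id stands in for a real ranking function.
ap_all = average_precision(sorted(full_index["Berlin"]), relevant)
ap_literal = average_precision(sorted(literal_index["Berlin"]), relevant)
print(ap_all, ap_literal)
```

In this toy case the literal-only index scores higher because the non-relevant metonymic document no longer appears in the result list. The example shows only the mechanism; it says nothing about the size of the effect on real GeoCLEF data.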