Detecting geographical references in the form of place names and associated spatial natural language

  • Authors:
  • Jochen L. Leidner; Michael D. Lieberman

  • Affiliations:
  • Thomson Reuters Global Resources, Catalyst Lab, Neuhofstrasse, Baar, Switzerland; University of Maryland

  • Venue:
  • SIGSPATIAL Special
  • Year:
  • 2011

Abstract

Recognizing spatial language in text documents, termed geoparsing, is useful for many applications: together with geocoding, the mapping of such language to latitude/longitude values, it connects the unstructured textual realm with the structured realm of Geographic Information Systems (GIS) [11]. For example, news stories about events happening in a particular location can be explored on a map for a spatial understanding of these events, as implemented by applications like the European Media Monitor (EMM) [18] and NewsStand [13, 20]. Web pages, blogs, encyclopedia articles, news stories, tweets, and travel reports can all benefit from such interlinking with maps, which requires the recognition of spatial language. Geoparsing can be considered a more specific application of Named Entity Recognition and Classification (NERC): NERC is concerned with automatically recognizing proper nouns of any kind, and is often taken to include monetary amounts, dates, and other types, while geoparsing is the NERC task applied specifically to locations. Geoparsing goes by many names in the literature, including geotagging, georecognition, and toponym recognition; for consistency, we refer to it only as geoparsing here. In this paper, we provide an overview of the challenges related to geoparsing, several families of geoparsing methods, existing systems and data collections available for performing geoparsing, and open research questions related to geoparsing.
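
To make the distinction between geoparsing and geocoding concrete, the following is a minimal sketch in Python, not a method taken from the paper. It assumes a tiny hand-made gazetteer (the `GAZETTEER` table, its approximate coordinates, and the `geoparse` and `geocode` helpers are all hypothetical) and uses plain dictionary lookup, which ignores the ambiguity that real geoparsing systems have to resolve.

```python
import re

# Hypothetical toy gazetteer: place name -> (latitude, longitude).
# Coordinates are approximate and for illustration only.
GAZETTEER = {
    "Baar": (47.20, 8.53),
    "Paris": (48.86, 2.35),
    "New York": (40.71, -74.01),
}


def geoparse(text):
    """Geoparsing step: return (name, start, end) for each gazetteer name in text."""
    # List longer names first so a multi-word name is preferred over any
    # shorter name that starts at the same position.
    names = sorted(GAZETTEER, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b"
    )
    return [(m.group(0), m.start(), m.end()) for m in pattern.finditer(text)]


def geocode(name):
    """Geocoding step: map a recognized name to coordinates (no disambiguation)."""
    return GAZETTEER[name]


text = "Floods hit Paris on Monday, while markets in New York stayed calm."
mentions = geoparse(text)
located = [(name, geocode(name)) for name, _, _ in mentions]
```

Here `mentions` holds the recognized place-name spans and `located` pairs each name with its coordinates. A lookup of this kind cannot tell a place from a person or company sharing the same name, nor choose among several places with the same name.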