Searching web documents as location sets

  • Authors: Marco D. Adelfio; Sarana Nutanong; Hanan Samet
  • Affiliations: University of Maryland, College Park, MD (all authors)
  • Venue: Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
  • Year: 2011

Abstract

A geographic search system named GeoXLS is presented, which enables users to submit a set of locations as a query object Q and to find documents containing locations similar to those in Q. Search results come from a collection of geotagged web documents, specifically a vast collection of spreadsheets obtained from the Web. The results are ranked according to their similarity to Q, using one of several user-selected similarity measures related to the Hausdorff distance. GeoXLS allows users to answer queries such as "I know the locations of n entities of type X. What sets of data contain points similar to my query points?" For example, given a set Q of known impact craters, find documents that contain locations similar to those in Q, possibly along with additional locations not in Q. In essence, this allows someone to "complete the set" by identifying sets containing similar locations. GeoXLS provides capabilities analogous to a standard keyword search engine, but with keywords specified geographically. In contrast to a search engine that handles only text queries, our geographic search system is capable of returning documents that are not exact matches to the query. For example, searching with query points in "Washington, DC", "Denver, Colorado", and "Chicago, Illinois" could return documents related to colleges with actual locations in "College Park, Maryland", "Boulder, Colorado", and "Evanston, Illinois", which are similar spatially, but not textually. GeoXLS can be useful in a wide variety of knowledge domains where the data can be represented as a collection of point sets.
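
The ranking idea described in the abstract can be illustrated with a minimal Python sketch. It is not the GeoXLS implementation: the abstract says only that the similarity measures are "related to the Hausdorff distance", so the sketch assumes the classical directed and symmetric Hausdorff distances over great-circle (haversine) distances. The function names, the document names, and the approximate city coordinates are all hypothetical and chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def directed_hausdorff(a_pts, b_pts):
    """Largest distance from any point of a_pts to its nearest point of b_pts."""
    if not a_pts or not b_pts:
        raise ValueError("both point sets must be non-empty")
    return max(min(haversine_km(a, b) for b in b_pts) for a in a_pts)


def hausdorff(a_pts, b_pts):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a_pts, b_pts),
               directed_hausdorff(b_pts, a_pts))


def rank_documents(query, documents, measure=directed_hausdorff):
    """Return document names ordered from most to least similar to the query."""
    scored = [(measure(query, pts), name) for name, pts in documents.items()]
    return [name for _, name in sorted(scored)]


# Approximate (lat, lon) coordinates, for illustration only.
query = [(38.90, -77.04), (39.74, -104.99), (41.88, -87.63)]  # DC, Denver, Chicago
documents = {
    "colleges": [(38.99, -76.94), (40.01, -105.27), (42.05, -87.69)],
    "west_coast": [(47.61, -122.33), (45.52, -122.68), (37.77, -122.42)],
}

print(rank_documents(query, documents))
```

Under these assumptions the directed distance from the query to a document does not penalize a document for containing extra locations, which suits the "complete the set" use case; passing `measure=hausdorff` instead would favor documents whose locations match the query set in both directions.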