Demonstration of Hadoop-GIS: a spatial data warehousing system over MapReduce

  • Authors:
  • Ablimit Aji;Xiling Sun;Hoang Vo;Qiaoling Liu;Rubao Lee;Xiaodong Zhang;Joel Saltz;Fusheng Wang

  • Affiliations:
  • Emory University;Northwestern University;Emory University;Emory University;Ohio State University;Ohio State University;Emory University;Emory University

  • Venue:
  • Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
  • Year:
  • 2013

Abstract

The proliferation of GPS-enabled devices and the rapid improvement of scientific instruments have produced massive amounts of spatial data in the last decade. Supporting high-performance spatial queries on large volumes of data has become increasingly important in numerous fields. This requires a scalable and efficient spatial data warehousing solution, because existing approaches exhibit scalability limitations and efficiency bottlenecks for large-scale spatial applications. In this demonstration, we present Hadoop-GIS, a scalable and high-performance spatial query system over MapReduce. Hadoop-GIS provides an efficient spatial query engine to process spatial queries, data- and space-based partitioning, and query pipelines that parallelize queries implicitly on MapReduce. Hadoop-GIS also provides an expressive, SQL-like spatial query language for workload specification. We will demonstrate how spatial queries are expressed in spatially extended SQL and submitted through a command-line or web interface for execution. Alongside the system demonstration, we explain the system architecture and how queries are translated to MapReduce operators, optimized, and executed on Hadoop. In addition, we will showcase how the system can be used to support two representative real-world use cases: large-scale pathology analytical imaging and geospatial data warehousing.
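To make the idea of space-based partitioning with an implicitly parallel query pipeline concrete, here is a minimal, single-process sketch of a partitioned spatial join written in map/reduce style. It is not Hadoop-GIS code and does not reflect its actual engine or query language: the fixed grid, the bounding-box intersection test, and every function name (`tiles_for`, `map_phase`, `reduce_phase`, `spatial_join`) are assumptions chosen for illustration only.

```python
from collections import defaultdict
from itertools import product

# Illustrative sketch only: all names and the fixed-grid scheme are hypothetical.
# A box is (min_x, min_y, max_x, max_y).

def tiles_for(box, tile_size):
    """Return the ids of every grid tile that the box overlaps."""
    min_x, min_y, max_x, max_y = box
    xs = range(int(min_x // tile_size), int(max_x // tile_size) + 1)
    ys = range(int(min_y // tile_size), int(max_y // tile_size) + 1)
    return list(product(xs, ys))

def intersects(a, b):
    """Axis-aligned bounding-box intersection test (touching counts)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def map_phase(records, tile_size):
    """Map: emit (tile_id, record) once per tile the record's box overlaps."""
    for tag, obj_id, box in records:
        for tile in tiles_for(box, tile_size):
            yield tile, (tag, obj_id, box)

def reduce_phase(grouped):
    """Reduce: join datasets A and B tile by tile, then drop duplicate pairs."""
    results = set()  # a set removes pairs found in more than one tile
    for items in grouped.values():
        left = [r for r in items if r[0] == "A"]
        right = [r for r in items if r[0] == "B"]
        for (_, a_id, a_box), (_, b_id, b_box) in product(left, right):
            if intersects(a_box, b_box):
                results.add((a_id, b_id))
    return sorted(results)

def spatial_join(records, tile_size=10.0):
    grouped = defaultdict(list)  # stands in for the MapReduce shuffle step
    for tile, rec in map_phase(records, tile_size):
        grouped[tile].append(rec)
    return reduce_phase(grouped)

records = [
    ("A", "a1", (1, 1, 4, 4)),
    ("A", "a2", (8, 8, 12, 12)),    # straddles four tiles
    ("B", "b1", (3, 3, 5, 5)),
    ("B", "b2", (11, 11, 15, 15)),
    ("B", "b3", (20, 20, 22, 22)),  # no partner in dataset A
]
print(spatial_join(records))
```

Objects that cross tile boundaries are replicated to every tile they overlap, so each tile can be joined independently; the price is that the same pair may be found more than once, which is why the sketch collects results in a set.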