CG_Hadoop: computational geometry in MapReduce

  • Authors: Ahmed Eldawy; Yuan Li; Mohamed F. Mokbel; Ravi Janardan

  • Affiliations: University of Minnesota, Twin Cities; University of Minnesota, Twin Cities; University of Minnesota, Twin Cities and Umm Al-Qura University, Makkah, Saudi Arabia; University of Minnesota, Twin Cities

  • Venue: Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
  • Year: 2013

Abstract

Hadoop, which employs the MapReduce programming paradigm, has been widely accepted as the standard framework for analyzing big data in distributed environments. Unfortunately, this rich framework has not been fully exploited for processing large-scale computational geometry operations. This paper introduces CG_Hadoop, a suite of scalable and efficient MapReduce algorithms for several fundamental computational geometry problems, namely polygon union, skyline, convex hull, farthest pair, and closest pair, which are key building blocks for other geometric algorithms. For each computational geometry operation, CG_Hadoop has two versions: one for the Apache Hadoop system and one for SpatialHadoop, a Hadoop-based system that is better suited to spatial operations. The proposed algorithms form the nucleus of a comprehensive MapReduce library of computational geometry operations. Extensive experiments on a cluster of 25 machines with datasets of up to 128 GB show that CG_Hadoop achieves up to 29x and 260x better performance than traditional algorithms when using the Hadoop and SpatialHadoop systems, respectively.
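To make the idea of a MapReduce-style geometric operation concrete, the following is a minimal single-process sketch of one operation named in the abstract, the skyline. It is not the paper's implementation: the abstract does not describe the algorithms in detail, so the partition, local-compute, and merge steps below are an assumed, generic divide-and-merge pattern, and every function name (`skyline`, `partition`, `mapreduce_skyline`) is hypothetical. The sketch also assumes a "max-max" skyline, in which a point is kept when no other point is at least as large in both coordinates.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


def dominates(p: Point, q: Point) -> bool:
    """Assumed max-max dominance: p is at least as large as q in both
    coordinates and is a different point."""
    return p[0] >= q[0] and p[1] >= q[1] and p != q


def skyline(points: Iterable[Point]) -> List[Point]:
    """Sequential skyline: sort by x descending (ties by y descending),
    then keep each point whose y exceeds every y seen so far."""
    result: List[Point] = []
    best_y = float("-inf")
    for x, y in sorted(set(points), key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            result.append((x, y))
            best_y = y
    return result


def partition(points: Iterable[Point], n_parts: int) -> List[List[Point]]:
    """Hypothetical stand-in for input splits: round-robin assignment."""
    pts = list(points)
    return [pts[i::n_parts] for i in range(n_parts)]


def mapreduce_skyline(points: Iterable[Point], n_parts: int = 4) -> List[Point]:
    # "Map" phase: each partition computes its local skyline independently.
    local = [skyline(part) for part in partition(points, n_parts)]
    # "Reduce" phase: a point on the global skyline is necessarily on the
    # skyline of its own partition, so merging local results is sufficient.
    merged = [p for part in local for p in part]
    return skyline(merged)


if __name__ == "__main__":
    sample = [(1, 5), (2, 4), (3, 3), (1, 1), (2, 2), (5, 1), (4, 0), (0, 6)]
    print(mapreduce_skyline(sample, n_parts=3))
```

The sketch works because local skylines discard only points that can never appear in the global answer, which keeps the data passed to the final merge small. A real Hadoop job would express the two phases as mapper and reducer classes, and a spatially aware partitioning could presumably prune more work up front, but both of those details lie outside what the abstract states.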