Support for high-performance queries on large volumes of scientific spatial data is becoming increasingly important in many applications. This growth is driven not only by geospatial problems in numerous fields, but also by emerging scientific applications that are increasingly data- and compute-intensive. For example, digital pathology imaging has emerged as a field over the past decade, in which examination of high-resolution images of human tissue specimens enables more effective diagnosis, prediction, and treatment of diseases. Systematic analysis of large-scale pathology images generates tremendous amounts of spatially derived quantifications of micro-anatomic objects, such as nuclei, blood vessels, and tissue regions. Analytical pathology imaging has high potential to support image-based computer-aided diagnosis. One major requirement is effective querying of such enormous amounts of data with fast response times, which faces two major challenges: the "big data" challenge and high computational complexity. In this paper, we present our work towards building a high-performance spatial query system for querying massive spatial data on MapReduce. Our framework takes an on-demand index building approach to processing spatial queries and a partition-merge approach to building parallel spatial query pipelines, which fits well with the MapReduce computing model. We demonstrate the framework by supporting multi-way spatial joins for algorithm evaluation and nearest neighbor queries for micro-anatomic objects. To reduce query response time, we propose cost-based query optimization to mitigate the effect of data skew. Our experiments show that the framework can efficiently support complex analytical spatial queries on MapReduce.
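
As a rough illustration of the partition-merge idea, the following Python sketch runs a two-way spatial join over bounding boxes in a single process. It is not the system described in the abstract: the fixed grid partitioning, the function names (`tiles_for`, `map_phase`, `reduce_tile`, `spatial_join`), and the sorted list used as a stand-in for a per-tile index built on demand are all assumptions made for this example. The map step assigns each box to every tile it overlaps, the reduce step indexes and joins one tile at a time, and the merge step removes pairs reported by more than one tile.

```python
from collections import defaultdict
from itertools import chain, product

# Boxes are (xmin, ymin, xmax, ymax). All names below are hypothetical.

def tiles_for(box, tile_size):
    """Return the keys of all grid tiles that a bounding box overlaps."""
    xmin, ymin, xmax, ymax = box
    xs = range(int(xmin // tile_size), int(xmax // tile_size) + 1)
    ys = range(int(ymin // tile_size), int(ymax // tile_size) + 1)
    return product(xs, ys)

def intersects(a, b):
    """True if two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def map_phase(dataset, tag, tile_size):
    """Map: emit (tile, record) for every tile a box overlaps."""
    for oid, box in dataset.items():
        for tile in tiles_for(box, tile_size):
            yield tile, (tag, oid, box)

def reduce_tile(records):
    """Reduce: index one side of a tile on demand, then probe with the other."""
    left = [(oid, box) for tag, oid, box in records if tag == "A"]
    right = [(oid, box) for tag, oid, box in records if tag == "B"]
    # Stand-in for an on-demand index: sort by xmin so a probe can stop early.
    left.sort(key=lambda rec: rec[1][0])
    for rid, rbox in right:
        for lid, lbox in left:
            if lbox[0] > rbox[2]:
                break  # every remaining left box starts right of rbox
            if intersects(lbox, rbox):
                yield lid, rid

def spatial_join(a, b, tile_size):
    """Partition both inputs, join per tile, and merge with deduplication."""
    groups = defaultdict(list)
    records = chain(map_phase(a, "A", tile_size), map_phase(b, "B", tile_size))
    for tile, rec in records:
        groups[tile].append(rec)
    merged = set()  # a pair spanning several tiles is found more than once
    for tile_records in groups.values():
        merged.update(reduce_tile(tile_records))
    return sorted(merged)

if __name__ == "__main__":
    a = {"a1": (0, 0, 5, 5), "a2": (10, 10, 12, 12)}
    b = {"b1": (2, 2, 6, 6), "b2": (20, 20, 21, 21)}
    print(spatial_join(a, b, tile_size=4))  # [('a1', 'b1')]
```

In this sketch the tile size only affects how much work is duplicated across tiles, not the result. Data skew of the kind the abstract mentions would show up here as a few tiles holding most of the records, which a real system would need to repartition.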