Due to the increasing amount of spatial data, parallel algorithms for processing big spatial data are becoming increasingly important. In particular, the shared-nothing architecture is attractive because it offers low-cost data processing. Moreover, popular MapReduce frameworks such as Hadoop allow the development of conceptually simple and scalable algorithms for processing big data on this architecture. In this work we address the problem of parallel loading of R-trees on a shared-nothing platform. The R-tree is a key element for efficient query processing in large spatial databases, but its creation is expensive. We propose a novel, scalable parallel loading algorithm for MapReduce. The core of our parallel loading is the state-of-the-art sequential sort-based query-adaptive R-tree loading algorithm, which builds R-trees optimized according to a commonly used cost model. In contrast to previous methods for loading R-trees with MapReduce, we construct the R-tree level-wise. Our experimental results show an almost linear speedup in the number of machines. Moreover, the resulting R-trees provide better query performance than R-trees built by other competitive bulk-loading algorithms.
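To make the idea of level-wise construction concrete, the following is a minimal single-machine sketch, not the algorithm proposed in this work. It assumes rectangles given as (xmin, ymin, xmax, ymax) tuples, orders them by center point (a simplification standing in for the sort order of a real sort-based loader), and packs them into nodes of a fixed capacity instead of choosing query-adaptive partitions from a cost model. The names `mbr`, `build_level`, and `build_levelwise` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def mbr(rects: Sequence[Rect]) -> Rect:
    """Return the minimum bounding rectangle of a non-empty group."""
    return (
        min(r[0] for r in rects),
        min(r[1] for r in rects),
        max(r[2] for r in rects),
        max(r[3] for r in rects),
    )


def build_level(rects: Sequence[Rect], capacity: int) -> List[Rect]:
    """Build one tree level: sort the entries, cut the sorted sequence
    into runs of at most `capacity`, and emit one bounding box per run.

    Assumption: sorting by center point and cutting at fixed positions
    is a placeholder for the sort order and cost-based partitioning a
    query-adaptive loader would use.
    """
    ordered = sorted(rects, key=lambda r: ((r[0] + r[2]) / 2, (r[1] + r[3]) / 2))
    return [mbr(ordered[i:i + capacity]) for i in range(0, len(ordered), capacity)]


def build_levelwise(rects: Sequence[Rect], capacity: int = 4) -> List[List[Rect]]:
    """Build the tree bottom-up, one level per pass.

    Returns the node bounding boxes per level, leaves first and root last.
    In a MapReduce setting each pass would roughly correspond to one job
    whose output feeds the next; here it is a plain loop.
    """
    if not rects:
        raise ValueError("need at least one rectangle")
    if capacity < 2:
        raise ValueError("capacity must be at least 2")
    levels = [build_level(rects, capacity)]
    while len(levels[-1]) > 1:
        levels.append(build_level(levels[-1], capacity))
    return levels


if __name__ == "__main__":
    # Sixteen unit squares laid out in a row along the x-axis.
    data = [(i, 0, i + 1, 1) for i in range(16)]
    for depth, nodes in enumerate(build_levelwise(data, capacity=4)):
        print(depth, nodes)
```

Each pass consumes only the bounding boxes produced by the previous pass, so the input shrinks by roughly a factor of `capacity` per level.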