Parallel database systems: the future of high performance database systems
Communications of the ACM
Multidimensional binary search trees used for associative searching
Communications of the ACM
A Framework for Generating Network-Based Moving Objects
Geoinformatica
R-trees: a dynamic index structure for spatial searching
SIGMOD '84 Proceedings of the 1984 ACM SIGMOD international conference on Management of data
Foundations of Multidimensional and Metric Data Structures (The Morgan Kaufmann Series in Computer Graphics and Geometric Modeling)
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Bigtable: a distributed storage system for structured data
OSDI '06 Proceedings of the 7th symposium on Operating systems design and implementation
Query and update efficient B+-tree based indexing of moving objects
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Quality and efficiency in high dimensional nearest neighbor search
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
An efficient multi-dimensional index for cloud data management
Proceedings of the first international workshop on Cloud data management
G-Store: a scalable data store for transactional multi key access in the cloud
Proceedings of the 1st ACM symposium on Cloud computing
Indexing multi-dimensional data in a cloud system
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
MD-HBase: A Scalable Multi-dimensional Data Infrastructure for Location Aware Services
MDM '11 Proceedings of the 2011 IEEE 12th International Conference on Mobile Data Management - Volume 01
Hi-index | 0.00 |
The ubiquity of location enabled devices has resulted in a wide proliferation of location based applications and services. To handle the growing scale, database management systems driving such location based services (LBS) must cope with high insert rates for location updates of millions of devices, while supporting efficient real-time analysis on latest location. Traditional DBMSs, equipped with multi-dimensional index structures, can efficiently handle spatio-temporal data. However, popular open-source relational database systems are overwhelmed by the high insertion rates, real-time querying requirements, and terabytes of data that these systems must handle. On the other hand, key-value stores can effectively support large scale operation, but do not natively provide multi-attribute accesses needed to support the rich querying functionality essential for the LBSs.We present the design and implementation of $\mathcal {MD}$ -HBase, a scalable data management infrastructure for LBSs that bridges this gap between scale and functionality. Our approach leverages a multi-dimensional index structure layered over a key-value store. The underlying key-value store allows the system to sustain high insert throughput and large data volumes, while ensuring fault-tolerance, and high availability. On the other hand, the index layer allows efficient multi-dimensional query processing. Our optimized query processing technique accesses only the index and storage level entries that intersect with the query region, thus ensuring efficient query processing. We present the design of $\mathcal {MD}$ -HBase that demonstrates how two standard index structures--the K-d tree and the Quad tree--can be layered over a range partitioned key-value store to provide scalable multi-dimensional data infrastructure. Our prototype implementation using HBase, a standard open-source key-value store, can handle hundreds of thousands of inserts per second using a modest 16 node cluster, while efficiently processing multi-dimensional range queries and nearest neighbor queries in real-time with response times as low as few hundreds of milliseconds.