A Cloud can be viewed as a flexible computing infrastructure composed of many compute nodes that provides resizable computing capacity to different customers. To fully harness the power of the Cloud, efficient data management is needed to handle huge volumes of data and to support a large number of concurrent end users. This in turn requires a scalable, high-throughput indexing scheme, one that not only incurs a low maintenance cost but also supports parallel search to improve scalability. In this paper, we present a novel, scalable B+-tree based indexing scheme for efficient data processing in the Cloud. Our approach can be summarized as follows. First, we build a local B+-tree index on each compute node that indexes only the data residing on that node. Second, we organize the compute nodes as a structured overlay and publish a portion of the local B+-tree nodes to the overlay for efficient query processing. Finally, we propose an adaptive algorithm that selects which B+-tree nodes to publish according to query patterns. We conduct extensive experiments on Amazon's EC2, and the results demonstrate that our indexing scheme is dynamic, efficient and scalable.
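The first two steps of the approach can be illustrated with a minimal sketch. It is not the paper's implementation: a sorted list stands in for each local B+-tree, a flat list of published key ranges stands in for the structured overlay, and the names `ComputeNode`, `GlobalIndex`, and `fanout` are hypothetical. The adaptive selection of published nodes is not modeled; a fixed `fanout` decides how many range summaries each node publishes.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity, not by contents
class ComputeNode:
    """One compute node; a sorted list stands in for its local B+-tree."""
    name: str
    keys: list = field(default_factory=list)

    def insert(self, key):
        # Keep keys sorted, as the leaf level of a B+-tree would be.
        self.keys.insert(bisect_left(self.keys, key), key)

    def local_range_search(self, lo, hi):
        # Return every local key k with lo <= k <= hi.
        return self.keys[bisect_left(self.keys, lo):bisect_right(self.keys, hi)]

    def published_ranges(self, fanout):
        # Split the sorted keys into at most `fanout` chunks and expose only
        # the (min, max) of each chunk. This mimics publishing a portion of
        # the index rather than the whole tree (an assumption of this sketch).
        if not self.keys:
            return []
        size = -(-len(self.keys) // fanout)  # ceiling division
        return [(self.keys[i], self.keys[min(i + size, len(self.keys)) - 1])
                for i in range(0, len(self.keys), size)]


class GlobalIndex:
    """Stand-in for the overlay: a flat list of (lo, hi, node) entries."""

    def __init__(self):
        self.entries = []

    def publish(self, node, fanout=2):
        # Replace any ranges this node published earlier with fresh ones.
        self.entries = [e for e in self.entries if e[2] is not node]
        self.entries += [(lo, hi, node)
                         for lo, hi in node.published_ranges(fanout)]

    def range_search(self, lo, hi):
        # Step 1: find nodes whose published ranges overlap [lo, hi].
        targets = []
        for e_lo, e_hi, node in self.entries:
            if e_lo <= hi and lo <= e_hi and node not in targets:
                targets.append(node)
        # Step 2: run the search on each candidate node's local index.
        result = []
        for node in targets:
            result.extend(node.local_range_search(lo, hi))
        return sorted(result)


# Example: two nodes, each indexing only its own data.
node_a, node_b = ComputeNode("a"), ComputeNode("b")
for k in (5, 1, 9, 3):
    node_a.insert(k)
for k in (30, 20, 25):
    node_b.insert(k)

index = GlobalIndex()
index.publish(node_a)
index.publish(node_b)
print(index.range_search(4, 22))
```

In this sketch, a larger `fanout` publishes finer-grained ranges, which lets a query skip more nodes at the price of more entries to maintain in the overlay; this is the kind of trade-off an adaptive selection algorithm would tune.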