MapReduce: simplified data processing on large clusters
Communications of the ACM - 50th anniversary issue: 1958 - 2008
Bigtable: A Distributed Storage System for Structured Data
ACM Transactions on Computer Systems (TOCS)
Leveraging a scalable row store to build a distributed text index
Proceedings of the first international workshop on Cloud data management
Distributed indexing of web scale datasets for the cloud
Proceedings of the 2010 Workshop on Massive Data Analytics on the Cloud
Hi-index | 0.00 |
Currently, cloud databases serve as mainstream data storage mechanism for unstructured data, primarily because of their high scalability and ease of availability. However, as yet, they lag behind RDBMs in terms of their support to developers for querying the data. The problem of developing frameworks to support flexible data queries is a very active area of research. In this work we consider HBase, a popular cloud database, inspired by Google's BigTable. Relying on the recent Coprocessor feature of HBase, we have developed a framework that developers can use to implement aggregate functions like row count, max, min, etc. We further extended the existing Coprocessor framework to support a Cursor functionality, so that a client can incrementally consume the Coprocessor generated result. We demonstrate the effectiveness of our extension by comparatively evaluating it against the existing Scanner API with four queries on three different data sets.