Ad-hoc aggregate queries are essential for query-intensive applications in cloud computing: they extract summary information from massive datasets to support decision-making. Current data storage schemes (row-store and column-store) cannot answer ad-hoc aggregate queries efficiently on massive datasets in cloud computing. This paper proposes a new data storage structure, the bit vector storage structure (bit-store for short), and focuses on ad-hoc aggregate query algorithms built on it. First, the storage model of bit-store is introduced, including its attribute encoding schemes and bit file organization. Second, aggregate operations for query processing are presented for the different encoding schemes. Third, a cost analysis of these aggregate operations is given. Finally, analytical and experimental results show the effectiveness and efficiency of the proposed algorithms.
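
The abstract does not spell out the encoding schemes or the bit file layout, so the following is only a minimal sketch of how aggregates can be computed over bit vectors in general. It assumes a simple equality encoding (one bit vector per distinct attribute value, held in memory as a Python integer); the function names and the sample data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_bit_vectors(column):
    """Assumed equality encoding: one bit vector per distinct value.

    Bit i of a vector is set when row i holds that value.
    """
    vectors = defaultdict(int)
    for row, value in enumerate(column):
        vectors[value] |= 1 << row
    return dict(vectors)


def popcount(bits):
    """Number of set bits, i.e. the number of matching rows."""
    return bin(bits).count("1")


def count_where(vectors, value):
    """COUNT(*) for rows where the attribute equals `value`."""
    return popcount(vectors.get(value, 0))


def sum_where(measure_vectors, selection):
    """SUM of a measure over the rows marked in `selection`.

    Each distinct measure value contributes value * (number of selected
    rows holding it), found with a bitwise AND and a popcount.
    """
    return sum(
        value * popcount(bits & selection)
        for value, bits in measure_vectors.items()
    )


# Hypothetical sample data: a dimension column and a measure column.
region = ["east", "west", "east", "north", "west", "east"]
sales = [10, 20, 10, 30, 20, 40]

region_vectors = build_bit_vectors(region)
sales_vectors = build_bit_vectors(sales)

# SELECT COUNT(*), SUM(sales) WHERE region = 'east'
east_count = count_where(region_vectors, "east")
east_sales = sum_where(sales_vectors, region_vectors["east"])

# SELECT region, SUM(sales) GROUP BY region
sales_by_region = {
    name: sum_where(sales_vectors, bits)
    for name, bits in region_vectors.items()
}
```

In this sketch the selection and the aggregate never touch the original rows: the predicate becomes a bit vector, and the aggregate reduces to AND operations and bit counts. Other encodings, such as range or binary encoding, would change how `sum_where` is evaluated.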