Run-length encoding is a popular compression scheme that is used extensively to compress attribute values in column stores. Out-of-order insertion of tuples can degrade the compression achieved by run-length encoding and, consequently, the performance of reads. In-place insertions, deletions and updates of tuples in a column store relation with n tuples take O(n) time. The linear cost is typically avoided by amortizing the cost of updates over batches. However, the relation is decompressed and subsequently re-compressed after a batch of updates is applied, which adds to the time complexity. We propose a novel indexing scheme called count indexes that supports O(log n) in-place insertions, deletions, updates and lookups on a run-length encoded sequence with n runs. We also show that count indexes efficiently update a batch of tuples, requiring almost constant time per updated tuple. Additionally, we show that count indexes are optimal. We extend count indexes to support O(log n) updates on bitmapped sequences with n values and adapt them to block-based stores.
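
To give a feel for how counts stored in a tree can make positional access on a run-length encoded sequence logarithmic in the number of runs, here is a minimal Python sketch. It is not the count index structure from the paper: it assumes a fixed set of runs held in a static binary tree whose internal nodes store the number of tuples beneath them, and the names `CountTree`, `locate`, `lookup` and `grow_run` are hypothetical. Inserting a value that splits a run or creates a new run would need a balanced, dynamic tree, which this sketch leaves out.

```python
class CountTree:
    """Illustrative static tree over run lengths (an assumption, not the paper's design).

    Leaves hold run lengths; each internal node holds the total tuple count
    of its subtree, so a position can be resolved by one root-to-leaf walk.
    """

    def __init__(self, runs):
        # runs: list of (value, length) pairs, i.e. a run-length encoded column
        self.values = [value for value, _ in runs]
        self.size = 1
        while self.size < len(runs):
            self.size *= 2
        self.count = [0] * (2 * self.size)
        for i, (_, length) in enumerate(runs):
            self.count[self.size + i] = length
        for node in range(self.size - 1, 0, -1):
            self.count[node] = self.count[2 * node] + self.count[2 * node + 1]

    def __len__(self):
        # Total number of tuples in the decoded sequence
        return self.count[1]

    def locate(self, pos):
        """Return (run_index, offset_in_run) for a 0-based tuple position."""
        if not 0 <= pos < self.count[1]:
            raise IndexError(pos)
        node = 1
        while node < self.size:
            left = 2 * node
            if pos < self.count[left]:
                node = left
            else:
                pos -= self.count[left]  # skip every tuple in the left subtree
                node = left + 1
        return node - self.size, pos

    def lookup(self, pos):
        """Return the value stored at tuple position pos."""
        run, _ = self.locate(pos)
        return self.values[run]

    def grow_run(self, run, delta):
        """Lengthen (or shorten, if delta < 0) an existing run in place."""
        node = self.size + run
        if self.count[node] + delta < 0:
            raise ValueError("run length cannot become negative")
        while node >= 1:
            self.count[node] += delta  # refresh counts on the leaf-to-root path
            node //= 2


column = CountTree([("a", 3), ("b", 2), ("c", 4)])
before = column.lookup(5)   # position 5 falls in the run of "c"
column.grow_run(1, 1)       # insert one more "b" into the existing run
after = column.lookup(5)    # position 5 now falls in the run of "b"
```

Both `locate` and `grow_run` touch one node per tree level, so their cost grows with the logarithm of the number of runs rather than with the number of tuples.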