Column-oriented indexes, such as projection or bitmap indexes, are compressed by run-length encoding to reduce storage and increase speed. Sorting the tables improves compression. On realistic data sets, permuting the columns in the right order before sorting can reduce the number of runs by a factor of two or more. Unfortunately, determining the best column order is NP-hard. For many cases, we prove that the number of runs in table columns is minimized if we order the columns by increasing cardinality before sorting. Experimentally, sorting based on Hilbert space-filling curves is poor at minimizing the number of runs.
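
The effect of column order on the run count can be illustrated with a small sketch. It assumes a table held in memory as a list of equal-length tuples and a plain lexicographic sort; the function names (`count_runs`, `total_runs`, `order_by_increasing_cardinality`) and the toy table are illustrative assumptions, not taken from the paper.

```python
from itertools import groupby


def count_runs(column):
    """Count maximal runs of identical consecutive values in one column."""
    return sum(1 for _ in groupby(column))


def total_runs(rows, column_order):
    """Permute the columns, sort the rows lexicographically,
    and sum the run counts over all columns."""
    permuted = sorted(tuple(row[i] for i in column_order) for row in rows)
    return sum(count_runs(col) for col in zip(*permuted))


def order_by_increasing_cardinality(rows):
    """Return column indexes ordered by number of distinct values.
    Assumes a non-empty table."""
    n_cols = len(rows[0])
    cardinality = [len({row[i] for row in rows}) for i in range(n_cols)]
    return sorted(range(n_cols), key=lambda i: cardinality[i])


# Toy table (hypothetical data): column 0 has 6 distinct values, column 1 has 2.
rows = [(i, i % 2) for i in range(6)]

high_cardinality_first = total_runs(rows, [0, 1])
low_cardinality_first = total_runs(rows, order_by_increasing_cardinality(rows))
print(high_cardinality_first, low_cardinality_first)
```

On this toy table, putting the low-cardinality column first lets the sort group its two values into two long runs, whereas sorting on the high-cardinality column first leaves the second column alternating. The example only illustrates the heuristic on one input; it does not demonstrate the paper's proof or its NP-hardness result.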