The art of computer programming, volume 3: (2nd ed.) sorting and searching
The art of computer programming, volume 3: (2nd ed.) sorting and searching
Fast algorithms for sorting and searching strings
SODA '97 Proceedings of the eighth annual ACM-SIAM symposium on Discrete algorithms
ACM Transactions on Database Systems (TODS)
PATRICIA—Practical Algorithm To Retrieve Information Coded in Alphanumeric
Journal of the ACM (JACM)
Managing gigabytes (2nd ed.): compressing and indexing documents and images
Managing gigabytes (2nd ed.): compressing and indexing documents and images
Making B+- trees cache conscious in main memory
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
ACM Computing Surveys (CSUR)
Main-memory index structures with fixed-size partial keys
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Query optimization in compressed database systems
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Main Memory Database Systems: An Overview
IEEE Transactions on Knowledge and Data Engineering
ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
Compressing Relations and Indexes
ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
Proceedings of the 17th International Conference on Data Engineering
Cache Conscious Indexing for Decision-Support in Main Memory
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
C-store: a column-oriented DBMS
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Super-Scalar RAM-CPU Cache Compression
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Integrating compression and execution in column-oriented database systems
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Performance tradeoffs in read-optimized databases
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Cache-efficient string sorting using copying
Journal of Experimental Algorithmics (JEA)
HAT-trie: a cache-conscious trie-based data structure for strings
ACSC '07 Proceedings of the thirtieth Australasian conference on Computer science - Volume 62
Buffering accesses to memory-resident index structures
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Cache-Conscious collision resolution in string hash tables
SPIRE'05 Proceedings of the 12th international conference on String Processing and Information Retrieval
Probabilistic ranking over relations
Proceedings of the 13th International Conference on Extending Database Technology
Graph-based concept identification and disambiguation for enterprise search
Proceedings of the 19th international conference on World wide web
FAST: fast architecture sensitive tree search on modern CPUs and GPUs
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
MOSS-DB: a hardware-aware OLAP database
WAIM'10 Proceedings of the 11th international conference on Web-age information management
Database compression on graphics processors
Proceedings of the VLDB Endowment
Multi-core vs. I/O wall: the approaches to conquer and cooperate
WAIM'11 Proceedings of the 12th international conference on Web-age information management
Designing fast architecture-sensitive tree search on modern multicore/many-core processors
ACM Transactions on Database Systems (TODS)
Reordering rows for better compression: Beyond the lexicographic order
ACM Transactions on Database Systems (TODS)
Toward efficient querying of compressed network payloads
USENIX ATC'12 Proceedings of the 2012 USENIX conference on Annual Technical Conference
Compacting transactional data in hybrid OLTP&OLAP databases
Proceedings of the VLDB Endowment
BitWeaving: fast scans for main memory data processing
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
DEMO: Adjustably encrypted in-memory column-store
Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security
Optimal re-encryption strategy for joins in encrypted databases
DBSec'13 Proceedings of the 27th international conference on Data and Applications Security and Privacy XXVII
Hi-index | 0.00 |
Column-oriented database systems [19, 23] perform better than traditional row-oriented database systems on analytical workloads such as those found in decision support and business intelligence applications. Moreover, recent work [1, 24] has shown that lightweight compression schemes significantly improve the query processing performance of these systems. One such a lightweight compression scheme is to use a dictionary in order to replace long (variable-length) values of a certain domain with shorter (fixedlength) integer codes. In order to further improve expensive query operations such as sorting and searching, column-stores often use order-preserving compression schemes. In contrast to the existing work, in this paper we argue that orderpreserving dictionary compression does not only pay off for attributes with a small fixed domain size but also for long string attributes with a large domain size which might change over time. Consequently, we introduce new data structures that efficiently support an order-preserving dictionary compression for (variablelength) string attributes with a large domain size that is likely to change over time. The main idea is that we model a dictionary as a table that specifies a mapping from string-values to arbitrary integer codes (and vice versa) and we introduce a novel indexing approach that provides efficient access paths to such a dictionary while compressing the index data. Our experiments show that our data structures are as fast as (or in some cases even faster than) other state-of-the-art data structures for dictionaries while being less memory intensive.