This paper proposes compressing data in Relational Database Management Systems (RDBMS) using existing text compression algorithms. Although the proposed technique is general, we believe it is particularly advantageous for compressing medium-sized and large dimension tables in data warehouses. Dimensions usually have a large number of text attributes, and reducing their size has a large impact on the execution time of queries that join dimensions with fact tables. In general, the high complexity and long execution time of most data warehouse queries make compressing dimension text attributes (and any text attributes that may exist in the fact table, such as false facts) an effective way to reduce query response time. The proposed approach was evaluated using the well-known TPC-H benchmark, and the results show that speed improvements greater than 40% can be achieved for most of the queries.
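As a rough illustration of why text attributes are good compression candidates, the following minimal sketch compresses one column of text values with a general-purpose compressor. It is not the paper's method: Python's `zlib` (DEFLATE) stands in for "an existing text compression algorithm", and the function names and the sample column are hypothetical. It also assumes that no value contains a newline, since newlines are used as the value separator.

```python
import zlib


def compress_column(values):
    """Pack a list of text values (one column) into a single DEFLATE blob."""
    # Assumption: no value contains "\n", which is used as the separator.
    raw = "\n".join(values).encode("utf-8")
    return zlib.compress(raw, 9)


def decompress_column(blob):
    """Recover the original list of text values from a compressed blob."""
    return zlib.decompress(blob).decode("utf-8").split("\n")


# Hypothetical dimension column with repetitive descriptive text.
sample = [
    f"Supplier#{i:09d} ships promptly; packaging carefully inspected"
    for i in range(1000)
]

blob = compress_column(sample)
raw_size = len("\n".join(sample).encode("utf-8"))

# Report the size before and after compression.
print(f"raw={raw_size} bytes, compressed={len(blob)} bytes")

# The round trip must return exactly the original values.
assert decompress_column(blob) == sample
```

Compressing a whole column at once keeps the sketch short, but it forces a full decompression to read any single value. A database engine would more plausibly compress at the attribute or page level so that individual rows remain addressable.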