Query-aware compression of join results

  • Authors:
  • Christopher M. Mullins;Lipyeow Lim;Christian A. Lang

  • Affiliations:
  • University of Hawai'i at Mānoa, Honolulu, HI;University of Hawai'i at Mānoa, Honolulu, HI;Acelot Inc. Santa Barbara, CA

  • Venue:
  • Proceedings of the 16th International Conference on Extending Database Technology
  • Year:
  • 2013

Abstract

Client-server database query processing has become an important paradigm in many data processing applications today. In cloud-based data services, for example, queries over structured data are sent to cloud-based servers for processing and the results are relayed back to the client devices. Network bandwidth between client devices and cloud-based servers is often a limited resource, and using data compression to reduce the amount of query result data transmitted would not only conserve bandwidth but also help with battery lifetime in the case of mobile client devices. For query result compression, current data compression methods do not exploit redundancy information that can be inferred from the query structure itself to achieve greater compression. In this paper we propose a novel query-aware compression method for compressing query results sent from database servers to client applications. Our method is based on two key ideas. First, we exploit redundancy information obtained from the query plan, and possibly from the database schema, to achieve better compression than standard non-query-aware compressors. Second, we use a collection of memory-limited dictionaries to encode attribute values in a lightweight and efficient manner. Each dictionary in the collection is also dynamically resized to adapt to changing temporal access characteristics. We evaluated our method empirically using the TPC-H benchmark and show that this technique is effective, especially when used in conjunction with standard compressors. Our results show that compression ratios of up to twice that of gzip are possible.
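The second idea in the abstract, encoding attribute values with a collection of memory-limited dictionaries, can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `BoundedDictionary`, the token format (`"lit"` / `"ref"`), the fixed capacity, and the least-recently-used eviction policy are all assumptions made for illustration. The sketch also leaves out the dynamic resizing and the query-plan-derived redundancy that the abstract describes.

```python
from collections import OrderedDict


class BoundedDictionary:
    """Fixed-capacity value dictionary (hypothetical; LRU eviction is assumed)."""

    def __init__(self, capacity):
        assert capacity >= 1
        self.capacity = capacity
        self.slot_of = OrderedDict()          # value -> slot, least recent first
        self.value_at = [None] * capacity     # slot -> value

    def _touch_or_insert(self, value):
        """Return the slot if value is already stored, else store it and return None."""
        if value in self.slot_of:
            self.slot_of.move_to_end(value)   # mark as most recently used
            return self.slot_of[value]
        if len(self.slot_of) < self.capacity:
            slot = len(self.slot_of)          # next unused slot
        else:
            _, slot = self.slot_of.popitem(last=False)  # reuse the LRU slot
        self.slot_of[value] = slot
        self.value_at[slot] = value
        return None

    def encode(self, value):
        """Emit a short slot reference on a hit, the literal value on a miss."""
        slot = self._touch_or_insert(value)
        return ("lit", value) if slot is None else ("ref", slot)

    def decode(self, token):
        """Mirror the encoder's state updates so both sides stay in sync."""
        kind, payload = token
        value = payload if kind == "lit" else self.value_at[payload]
        self._touch_or_insert(value)
        return value


def encode_rows(rows, capacity=2):
    """Encode result rows using one bounded dictionary per attribute (column)."""
    dicts = [BoundedDictionary(capacity) for _ in rows[0]]
    return [[d.encode(v) for d, v in zip(dicts, row)] for row in rows]


def decode_rows(encoded, capacity=2):
    """Rebuild the rows on the client from the token stream."""
    dicts = [BoundedDictionary(capacity) for _ in encoded[0]]
    return [tuple(d.decode(t) for d, t in zip(dicts, row)) for row in encoded]


# Toy join result: the customer name repeats across joined order rows.
rows = [("alice", 1, "pen"), ("alice", 2, "ink"), ("bob", 3, "pen")]
encoded = encode_rows(rows)
decoded = decode_rows(encoded)
```

Because the decoder applies the same insertions and evictions as the encoder, the dictionary contents never need to be transmitted; only literals and slot references cross the network. The resulting token stream could then be passed to a standard compressor such as gzip, in line with the abstract's observation that the technique works well in conjunction with standard compressors.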