ItCompress: An Iterative Semantic Compression Algorithm

  • Authors:
  • H. V. Jagadish; Raymond T. Ng; Beng Chin Ooi; Anthony K. H. Tung

  • Venue:
  • ICDE '04 Proceedings of the 20th International Conference on Data Engineering
  • Year:
  • 2004

Abstract

Real datasets are often large enough to necessitate data compression. Traditional 'syntactic' data compression methods treat the table as a large byte string and operate at the byte level. The tradeoff in such cases is usually between the ease of retrieval (the ease with which one can retrieve a single tuple or attribute value without decompressing a much larger unit) and the effectiveness of the compression. In this regard, the use of semantic compression has generated considerable interest and motivated certain recent works. In this paper, we propose a semantic compression algorithm called ItCompress (ITerative Compression), which achieves good compression while permitting access even at attribute level without requiring the decompression of a larger unit. ItCompress iteratively improves the compression ratio of the compressed output during each scan of the table. The amount of compression can be tuned based on the number of iterations. Moreover, the initial iterations provide significant compression, thereby making it a cost-effective compression technique. Extensive experiments were conducted and the results indicate the superiority of ItCompress with respect to previously known techniques, such as 'SPARTAN' and 'fascicles'.
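
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a representative-row scheme: each tuple is stored as the id of a representative row, a per-attribute match mask, and the values that do not match. It also assumes exact matching on every attribute and seeds the representatives with the first k rows. All function names (`assign`, `refit`, `compress`, `read_attribute`) are hypothetical.

```python
from collections import Counter


def assign(rows, reps):
    """Encode each row against the representative it matches on most attributes."""
    compressed = []
    for row in rows:
        best = max(range(len(reps)),
                   key=lambda i: sum(a == b for a, b in zip(row, reps[i])))
        mask = [a == b for a, b in zip(row, reps[best])]
        # Only the non-matching ("outlying") values are stored explicitly.
        outliers = [a for a, m in zip(row, mask) if not m]
        compressed.append((best, mask, outliers))
    return compressed


def refit(rows, compressed, reps):
    """Rebuild each representative from the most common value per attribute."""
    new_reps = []
    for i, rep in enumerate(reps):
        members = [row for row, (rid, _, _) in zip(rows, compressed) if rid == i]
        if not members:
            new_reps.append(rep)  # keep an unused representative unchanged
            continue
        new_reps.append(tuple(Counter(col).most_common(1)[0][0]
                              for col in zip(*members)))
    return new_reps


def compress(rows, k, iterations):
    """Assumption: seed with the first k rows, then alternate refit and assign."""
    reps = [tuple(r) for r in rows[:k]]
    compressed = assign(rows, reps)
    for _ in range(iterations):  # one pass over the table per iteration
        reps = refit(rows, compressed, reps)
        compressed = assign(rows, reps)
    return reps, compressed


def read_attribute(reps, entry, j):
    """Recover attribute j of one tuple without decoding any other tuple."""
    rid, mask, outliers = entry
    if mask[j]:
        return reps[rid][j]
    # Position in the outlier list = number of mismatches before column j.
    return outliers[sum(1 for m in mask[:j] if not m)]
```

In this sketch the number of explicitly stored values cannot grow from one iteration to the next, since each refit and each reassignment only reduces mismatches. That mirrors the claim that compression can be tuned by the number of iterations, and `read_attribute` illustrates attribute-level access without decompressing a larger unit.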