Improving compression performance of block sorting coder

  • Authors:
  • Jeffrey Jones;Ryan Smith

  • Affiliations:
  • Department of Mathematics and Computer Science, SUNY Fredonia, Fredonia, NY;Department of Mathematics and Computer Science, SUNY Fredonia, Fredonia, NY

  • Venue:
  • Journal of Computing Sciences in Colleges
  • Year:
  • 2002

Abstract

One of the most important developments of the last decade is the availability of Internet access and the World Wide Web to the public. Data traffic over the Internet has been increasing daily as new web sites are introduced and more people come online. To reduce this traffic and increase throughput, effective data compression schemes are needed.

One of the recent developments in data compression is the Block Sorting Coder (also known as BW94 or the Burrows-Wheeler Compression Algorithm), a technique introduced by Burrows and Wheeler [1]. We use the name Block Sorting Coder (BSC), since it is the more widely used name. When applied to text or image data, BSC achieves better compression rates than Ziv-Lempel techniques at comparable speed, while its compression performance is close to that of context-based methods such as PPM. The BSC coder consists of three major components. The first step performs a lexical (alphabetical) sorting transformation, widely called the Burrows-Wheeler Transformation (BWT). The second step is the Move-to-Front (MTF) coder introduced by Bentley et al. [2]. The last step is a statistical coder, such as a Huffman or arithmetic coder.

It has been reported in [3] that while the MTF coder in the second step is essential for good compression of text data, it may not be needed for color-mapped images. Indeed, [3] shows that without the MTF coder, 10% more compression can be achieved over Bzip (a variant of the block sorting coder).

Because of the multimedia nature of the World Wide Web, our current research focuses on improving the compression performance of the block sorting coder by detecting different data types, such as audio, video, and other image formats, at the block sorting stage, and by developing a suitable arithmetic coder.

Our experimental results show that, for some data types such as audio, compression improves by approximately 8-10% on average over Bzip.

In conclusion, we aim to extend our research by simulating other data types, developing an improved version of the Block Sorting Coder, and releasing the improved coder in the near future.
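
For readers unfamiliar with the pipeline, the first two BSC stages named in the abstract (BWT followed by Move-to-Front) can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' coder: the function names (`bwt`, `inverse_bwt`, `mtf_encode`, `mtf_decode`) are hypothetical, the rotation sort is deliberately naive, and the final statistical coder (Huffman or arithmetic) is omitted.

```python
def bwt(block: bytes) -> tuple[bytes, int]:
    """Naive BWT: sort all rotations of the block and keep the last column.

    Returns the last column and the row index of the original block,
    which the decoder needs to invert the transform.
    """
    n = len(block)
    if n == 0:
        return b"", 0
    # Sort rotation start positions; quadratic memory, for illustration only.
    order = sorted(range(n), key=lambda i: block[i:] + block[:i])
    last = bytes(block[(i - 1) % n] for i in order)
    return last, order.index(0)


def inverse_bwt(last: bytes, primary: int) -> bytes:
    """Recover the original block from the last column and the primary index."""
    n = len(last)
    if n == 0:
        return b""
    # A stable sort of positions by byte links each sorted row to its
    # position in the last column.
    order = sorted(range(n), key=lambda i: last[i])
    out = bytearray()
    row = primary
    for _ in range(n):
        row = order[row]
        out.append(last[row])
    return bytes(out)


def mtf_encode(data: bytes) -> list[int]:
    """Move-to-Front: emit each byte's current rank, then move it to rank 0."""
    table = list(range(256))
    ranks = []
    for b in data:
        r = table.index(b)
        ranks.append(r)
        table.insert(0, table.pop(r))
    return ranks


def mtf_decode(ranks: list[int]) -> bytes:
    """Inverse of mtf_encode, maintaining the same table."""
    table = list(range(256))
    out = bytearray()
    for r in ranks:
        b = table.pop(r)
        out.append(b)
        table.insert(0, b)
    return bytes(out)


if __name__ == "__main__":
    last, primary = bwt(b"banana")
    print(last, primary)      # b'nnbaaa' 3
    print(mtf_encode(last))   # [110, 0, 99, 99, 0, 0]
```

The BWT groups identical symbols into runs (`nnbaaa`), and MTF turns those runs into zeros, which is the skew a statistical coder then exploits. Skipping `mtf_encode` and passing the BWT output straight to the final coder corresponds to the MTF-free variant that [3] reports for color-mapped images.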