An approach to phrase selection for offline data compression

  • Authors:
  • A. Turpin;W. F. Smyth

  • Affiliations:
  • Curtin University of Technology, Perth, Western Australia, 6845;Curtin University of Technology, Perth, Western Australia, 6845

  • Venue:
  • ACSC '02 Proceedings of the twenty-fifth Australasian conference on Computer science - Volume 4
  • Year:
  • 2002

Abstract

Recently, several offline data compression schemes have been published that expend large amounts of computing resources when encoding a file, but decode the file quickly. These compressors work by identifying phrases in the input data and storing the data as a series of pointers to these phrases. This paper explores the application of an algorithm for computing all repeating substrings within a string to phrase selection in an offline data compressor. Using our approach, we obtain compression similar to that of the best known offline compressors on genetic data, but poor results on general text. It seems, however, that an alternative approach based on selecting repeating substrings is feasible.
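
To illustrate the general idea of phrase-based offline compression described in the abstract, the following Python sketch finds repeating substrings by brute force and stores the input as a series of pointers into a phrase table. It is not the authors' algorithm: the function names, the `min_len` and `max_len` bounds, and the greedy longest-match parse are all assumptions made for this illustration, and a real compressor would use an efficient repeat-finding algorithm and an entropy coder for the output.

```python
from collections import defaultdict


def repeating_substrings(text, min_len=2, max_len=8):
    """Map every substring that occurs at least twice to its start positions.

    Brute force over all lengths in [min_len, max_len]; illustrative only.
    """
    positions = defaultdict(list)
    for length in range(min_len, max_len + 1):
        for start in range(len(text) - length + 1):
            positions[text[start:start + length]].append(start)
    return {sub: pos for sub, pos in positions.items() if len(pos) >= 2}


def compress(text, min_len=2, max_len=8):
    """Greedy left-to-right parse using the longest repeating substring.

    Returns (phrases, pointers). An int in pointers indexes into phrases;
    a str in pointers is a literal character with no matching phrase.
    """
    candidates = repeating_substrings(text, min_len, max_len)
    phrases = []   # phrase table, in order of first use
    index = {}     # phrase -> position in the phrase table
    pointers = []
    i = 0
    while i < len(text):
        # Try the longest possible phrase first, then shorter ones.
        for length in range(min(max_len, len(text) - i), min_len - 1, -1):
            piece = text[i:i + length]
            if piece in candidates:
                if piece not in index:
                    index[piece] = len(phrases)
                    phrases.append(piece)
                pointers.append(index[piece])
                i += length
                break
        else:
            # No repeating substring starts here: emit a literal.
            pointers.append(text[i])
            i += 1
    return phrases, pointers


def decompress(phrases, pointers):
    """Decoding is a single pass of table lookups, hence fast."""
    return "".join(phrases[p] if isinstance(p, int) else p for p in pointers)
```

For example, `compress("abab")` yields the phrase table `["ab"]` and the pointer sequence `[0, 0]`, and `decompress` restores the original string. The asymmetry mirrors the abstract: the expensive work of finding phrases happens at encoding time, while decoding only follows pointers.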