Compression of whole genome alignments

  • Authors:
  • Pavol Hanus;Janis Dingel;Georg Chalkidis;Joachim Hagenauer

  • Affiliations:
  • Department of Electrical Engineering and Information Technology, Technische Universität München, München, Germany;Department of Electrical Engineering and Information Technology, Technische Universität München, München, Germany;Department of Electrical Engineering and Information Technology, Technische Universität München, München, Germany;Department of Electrical Engineering and Information Technology, Technische Universität München, München, Germany

  • Venue:
  • IEEE Transactions on Information Theory - Special issue on information theory in molecular biology and neuroscience
  • Year:
  • 2010

Abstract

Recent advances in DNA sequencing technology have led to exponential growth in publicly available genomic sequence data. Whole genome alignments are a particularly voluminous, frequently used, static data set. This paper introduces the first lossless compression algorithm for such data sets based on well-established statistical evolutionary models and prediction techniques from lossless binary image compression. Compared to the currently used Lempel-Ziv (LZ) compression, the compression rate is improved by a factor of 1.6.