Microarray image compression: SLOCO and the effect of information loss

  • Authors:
  • R. Jornsten; W. Wang; B. Yu; K. Ramchandran

  • Affiliations:
  • Department of Statistics, Rutgers University; Department of Electrical Engineering, University of California, Berkeley, 211 Cory Hall # 1772, Berkeley, CA; Department of Statistics, University of California, Berkeley; Department of Electrical Engineering, University of California, Berkeley, 211 Cory Hall # 1772, Berkeley, CA

  • Venue:
  • Signal Processing - Special issue: Genomic signal processing
  • Year:
  • 2003

Abstract

Microarray image technology is a powerful tool for monitoring the expression of thousands of genes simultaneously. Each microarray experiment produces immense amounts of image data, and efficient storage and transmission require compression that takes advantage of microarray image structure. In this paper we develop a compression scheme for microarray images that can be either lossless or lossy with successive refinements. Existing measures of distortion, such as mean squared pixel-wise error and visual fidelity, are not appropriate for microarray images. We introduce a new measure of distortion for lossy compression: the sensitivity of microarray information extraction to compression loss. Furthermore, our scheme has a coded data structure that allows fast decoding and reprocessing of image sub-blocks, and includes summary statistics and image segmentation information. The average lossless compression ratio is 1.83:1 for our cDNA test images and 2.43:1 for our inkjet test images, comparable to or better than state-of-the-art lossless schemes, yet with additional structure and information. At an average lossy compression ratio of 8:1 for cDNA microarrays, we find that our scheme minimizes the effects of compression loss compared to other algorithms. We show that the variability in differential gene expression levels extracted from lossily vs. losslessly compressed microarray images is less than both the variability between different arrays and the variability between different extraction algorithms. In fact, lossy compression can improve the estimation of gene expression levels for cDNA images.
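
The idea of judging lossy compression by its effect on extracted expression levels, rather than by pixel-wise error, can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the abstract does not specify the paper's actual distortion measure, so the function names (`compression_ratio`, `log_ratios`, `rms_difference`), the use of log2 red/green intensity ratios, the root-mean-square summary, and the spot intensities are all hypothetical stand-ins.

```python
import math


def compression_ratio(original_bytes: int, compressed_bytes: int) -> float:
    """Return the ratio original:compressed (e.g. 8.0 means 8:1)."""
    if compressed_bytes <= 0:
        raise ValueError("compressed_bytes must be positive")
    return original_bytes / compressed_bytes


def log_ratios(spots):
    """Map (red, green) spot intensities to log2 differential expression.

    Assumption: a plain log2(red / green) stands in for whatever
    extraction algorithm is actually applied to the image.
    """
    return [math.log2(red / green) for red, green in spots]


def rms_difference(a, b):
    """Root-mean-square difference between two equal-length value lists."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical spot intensities (red, green) for the same three genes.
lossless_spots = [(1200.0, 600.0), (300.0, 300.0), (150.0, 600.0)]
lossy_spots = [(1180.0, 610.0), (305.0, 296.0), (160.0, 590.0)]
# Same lossless image, but a second (hypothetical) extraction algorithm.
other_method_spots = [(1100.0, 640.0), (320.0, 280.0), (170.0, 560.0)]

reference = log_ratios(lossless_spots)

# Variability attributable to compression loss.
loss_variability = rms_difference(reference, log_ratios(lossy_spots))
# Variability attributable to the choice of extraction algorithm.
method_variability = rms_difference(reference, log_ratios(other_method_spots))

print(f"compression loss: {loss_variability:.3f}")
print(f"extraction method: {method_variability:.3f}")
```

In this framing, a lossy scheme would be considered acceptable when the first number stays below the second, and below the corresponding figure computed across replicate arrays, which mirrors the comparison described in the abstract.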