A document image model and estimation algorithm for optimized JPEG decompression

  • Authors: Tak-Shing Wong, Charles A. Bouman, Ilya Pollak, Zhigang Fan

  • Affiliations: School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN (Wong, Bouman, Pollak); Xerox Research and Technology, Xerox Corporation, Webster, NY (Fan)

  • Venue: IEEE Transactions on Image Processing
  • Year: 2009

Abstract

The JPEG standard is one of the most prevalent image compression schemes in use today. While JPEG was designed for natural images, it is also widely used to encode raster documents. Unfortunately, JPEG's characteristic blocking and ringing artifacts can severely degrade the quality of text and graphics in complex documents. We propose a JPEG decompression algorithm designed to produce substantially higher quality images from the same standard JPEG encodings. The method incorporates into the decoding process a document image model that accounts for the wide variety of content in modern complex color documents. It first segments the JPEG-encoded document into regions corresponding to background, text, and picture content. The text and background regions are then decoded using maximum a posteriori (MAP) estimation. Most importantly, the MAP reconstruction of the text regions uses a model that accounts for the spatial characteristics of text and graphics. Experimental comparisons with baseline JPEG decoding and with three other decoding schemes demonstrate that our method substantially improves the quality of decoded images, both visually and as measured by PSNR.
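
The abstract does not give the form of the MAP estimator, so the following is only a minimal sketch of the general idea behind prior-based JPEG decoding, not the authors' algorithm. It assumes the usual formulation in which a decoded block must have DCT coefficients that quantize back to the encoded levels, and it swaps in a plain alternation between a smoothing step and a projection onto that constraint (a POCS-style heuristic) in place of a true MAP optimization. The names `smooth`, `decode_block`, the scalar step size `q`, and the iteration count are all hypothetical, and the level shift and per-frequency quantization table used by real JPEG are omitted for brevity.

```python
import numpy as np

N = 8  # JPEG operates on 8x8 blocks


def dct_matrix(n=N):
    """Orthonormal DCT-II matrix D, so that coeffs = D @ block @ D.T."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)        # DC row has a different scale
    return d


D = dct_matrix()


def quantize(block, q):
    """Encoder side: DCT followed by uniform rounding with step q."""
    return np.round(D @ block @ D.T / q).astype(int)


def project_to_constraint(block, levels, q):
    """Clip each DCT coefficient into the interval that rounds to its level."""
    coeffs = D @ block @ D.T
    lo = (levels - 0.5) * q
    hi = (levels + 0.5) * q
    return D.T @ np.clip(coeffs, lo, hi) @ D


def smooth(block):
    """Hypothetical stand-in for a prior: mild 4-neighbour averaging."""
    p = np.pad(block, 1, mode="edge")
    neighbours = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
    return 0.5 * block + 0.125 * neighbours


def decode_block(levels, q, iters=20):
    """Start from the baseline decode, then alternate prior step and projection."""
    x = D.T @ (levels * q) @ D        # baseline JPEG reconstruction
    for _ in range(iters):
        x = project_to_constraint(smooth(x), levels, q)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    original = rng.uniform(0, 255, (N, N))
    q = 16.0                          # assumed uniform quantizer step
    levels = quantize(original, q)
    decoded = decode_block(levels, q)
    # Largest deviation from the dequantized coefficients; it never exceeds q / 2.
    print(np.max(np.abs(D @ decoded @ D.T - levels * q)))
```

Because every iteration ends with the projection, the result stays consistent with the encoded data whatever prior step is used. In a segmentation-based decoder such as the one the abstract describes, a different prior could in principle be plugged in per region class, for example one favouring sharp edges in text regions, but the details of those models are not given here.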