Substituting outline fonts for bitmap fonts in archived PDF files

  • Authors: S. G. Probets; D. F. Brailsford

  • Affiliations: Department of Information Science, Loughborough University, Loughborough, Leicestershire LE11 3TU, U.K.; School of Computer Science, University of Nottingham, Jubilee Campus, Nottingham NG8 1BB, U.K.

  • Venue: Software—Practice & Experience
  • Year: 2003

Abstract

As collections of archived digital documents continue to grow, the maintenance of an archive and the quality of reproduction from the archived format become important long-term considerations. In particular, Adobe's Portable Document Format (PDF) is now an important 'final form' standard for archiving and distributing electronic versions of technical documents. All embedded images in a PDF file, and any fonts used for text rendering, should at the very minimum be easily readable on screen. Unfortunately, because PDF is based on PostScript technology, it allows the embedding of bitmap fonts in Adobe Type 3 format as well as higher-quality outline fonts in TrueType or Adobe Type 1 formats. Bitmap fonts do not generally perform well when they are scaled and rendered on low-resolution devices such as workstation screens. The work described here investigates how a plug-in to Adobe Acrobat enables bitmap fonts to be replaced by corresponding outline fonts, using a checksum matching technique against a canonical set of bitmap fonts, as originally distributed. The target documents for our initial investigations are those PDF files produced by (La)TeX systems when set up in a default (bitmap font) configuration. For all bitmap fonts where recognition exceeds a certain confidence threshold, replacement fonts in Adobe Type 1 (outline) format can be substituted, with consequent improvements in file size, screen display quality and rendering speed. The accuracy of font recognition is discussed, together with the prospects of extending these methods to bitmap-font PDF files from sources other than (La)TeX.
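
The checksum matching idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which checksum is used, how glyph bitmaps are extracted from the PDF, or how confidence is computed, so everything below is an assumption made for illustration: MD5 over raw glyph bitmap bytes stands in for the checksum, confidence is taken to be the fraction of a font's glyphs that match one canonical font, and the function names (`glyph_checksum`, `build_index`, `recognise_font`) and the 0.9 threshold are hypothetical. It is not the plug-in's actual implementation.

```python
import hashlib
from collections import Counter


def glyph_checksum(bitmap: bytes) -> str:
    """Checksum of one glyph bitmap (MD5 is an assumed stand-in)."""
    return hashlib.md5(bitmap).hexdigest()


def build_index(canonical_fonts):
    """Map each glyph checksum to the canonical font names containing it.

    canonical_fonts: dict of font name -> iterable of glyph bitmaps (bytes).
    """
    index = {}
    for name, glyphs in canonical_fonts.items():
        for bitmap in glyphs:
            index.setdefault(glyph_checksum(bitmap), set()).add(name)
    return index


def recognise_font(glyphs, index, threshold=0.9):
    """Return (font name or None, confidence) for an embedded bitmap font.

    Confidence is the fraction of the embedded glyphs whose checksum
    matches the best-scoring canonical font (an assumed definition).
    """
    glyphs = list(glyphs)
    if not glyphs:
        return None, 0.0
    votes = Counter()
    for bitmap in glyphs:
        for name in index.get(glyph_checksum(bitmap), ()):
            votes[name] += 1
    if not votes:
        return None, 0.0
    name, hits = votes.most_common(1)[0]
    confidence = hits / len(glyphs)
    # Only report a match when confidence reaches the threshold.
    return (name if confidence >= threshold else None), confidence


# Toy data: the byte strings are placeholders, not real glyph bitmaps.
canonical = {
    "cmr10": [b"\x01\x02", b"\x03\x04", b"\x05"],
    "cmbx10": [b"\xff", b"\xfe"],
}
index = build_index(canonical)
print(recognise_font([b"\x01\x02", b"\x03\x04", b"\x05"], index))
print(recognise_font([b"\x01\x02", b"\x00\x00"], index))
```

In this sketch a font whose glyphs all match a canonical font is reported by name, while a font below the threshold returns `None` and would be left as a bitmap font rather than replaced.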