Optical font recognition using conditional random field

  • Authors:
  • Aziza Satkhozhina; Ildus Ahmadullin; Jan P. Allebach

  • Affiliations:
  • Purdue University, West Lafayette, IN, USA; Hewlett-Packard Laboratories, Palo Alto, CA, USA; Purdue University, West Lafayette, IN, USA

  • Venue:
  • Proceedings of the 2013 ACM symposium on Document engineering
  • Year:
  • 2013


Abstract

Automated publishing systems require large databases of document page layout templates. Most of these layout templates are created manually. A lower-cost alternative is to extract document page layouts from existing documents. To extract the layout from a scanned document image, it is necessary to perform Optical Font Recognition (OFR), because the font is an important element in layout design. In this paper, we use the Conditional Random Field (CRF) model to perform OFR. First, we extract typographical features of the text. Then, we train the probabilistic model using a log-linear parameterization of the CRF. The advantage of using a CRF is that it does not assume that the typographical features are independent of each other. We demonstrate the effectiveness of this approach on a set of 616 fonts.
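
To make the log-linear parameterization concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a hypothetical pair of binary typographical features (`serif`, `italic`), two hypothetical font labels, and hand-picked weights; the abstract does not specify the real feature set or the training procedure. The label posterior is p(y | x) = exp(sum_k w_k f_k(x, y)) / Z(x), and the pairwise feature functions are what allow the model to capture dependence between features instead of treating them as independent.

```python
import math
from itertools import combinations


def feature_functions(x, y):
    """Return the active feature functions f_k(x, y) as a dict of key -> value.

    x is a dict of typographical feature name -> observed value (hypothetical),
    y is a candidate font label.
    """
    feats = {}
    names = sorted(x)
    # Unary functions: one indicator per (label, feature, value).
    for n in names:
        feats[(y, n, x[n])] = 1.0
    # Pairwise functions: indicators over joint feature values, so the model
    # does not have to assume the features are independent of each other.
    for a, b in combinations(names, 2):
        feats[(y, a, x[a], b, x[b])] = 1.0
    return feats


def score(weights, x, y):
    """Log-linear score: sum_k w_k * f_k(x, y). Unknown keys get weight 0."""
    return sum(weights.get(k, 0.0) * v for k, v in feature_functions(x, y).items())


def posterior(weights, x, labels):
    """p(y | x) = exp(score(x, y)) / Z(x), computed stably."""
    scores = {y: score(weights, x, y) for y in labels}
    m = max(scores.values())  # subtract the max before exponentiating
    exps = {y: math.exp(s - m) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}


# Hypothetical labels, observation, and weights, for illustration only.
labels = ["FontA", "FontB"]
x = {"serif": True, "italic": False}
weights = {
    ("FontA", "serif", True): 1.0,                    # unary weight
    ("FontA", "italic", False, "serif", True): 0.5,   # pairwise weight
}
probs = posterior(weights, x, labels)
print(max(probs, key=probs.get))
```

In this sketch the weights are fixed by hand. In a trained model they would be estimated from labeled text samples, typically by maximizing the conditional log-likelihood of the training labels.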