Multifont Classification Using Typographical Attributes

  • Authors:
  • M. Jung;Y. Shin;S. Srihari

  • Venue:
  • ICDAR '99 Proceedings of the Fifth International Conference on Document Analysis and Recognition
  • Year:
  • 1999

Abstract

This paper introduces a multifont classification scheme that supports the recognition of multifont, multisize characters. The scheme extracts typographical attributes such as ascenders, descenders, and serifs from a word image and feeds them to a neural network classifier, which outputs the font class. It classifies 7 commonly used fonts at all point sizes from 7 to 18 and handles a wide range of image quality, including severely touching characters. Font detection can improve both character segmentation and character recognition, because identifying the font provides information about the structure and typographical design of the characters. The algorithm can therefore help a machine-printed OCR system maintain good recognition rates regardless of font and size. In experiments, font classification accuracy reaches about 95 percent even with severely touching characters. The technique developed for the 7 selected fonts can be applied to any other font.
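
The abstract does not say how the typographical attributes are measured. As a rough illustration only, the sketch below estimates ascender and descender extents from the horizontal projection profile of a binary word image. The function names, the `body_fraction` threshold, and the choice of features are assumptions made for this example and are not taken from the paper; serif detection and the neural network classifier are not shown.

```python
def horizontal_profile(image):
    """Count ink pixels (value 1) in each row of a binary word image."""
    return [sum(row) for row in image]


def zone_features(image, body_fraction=0.5):
    """Estimate x-height plus ascender and descender ratios for a word image.

    Rows whose ink count is at least body_fraction of the densest row are
    treated as the x-height band. This is an assumed heuristic for the
    sketch, not the method described in the paper.
    """
    profile = horizontal_profile(image)
    ink_rows = [i for i, count in enumerate(profile) if count > 0]
    if not ink_rows:
        raise ValueError("image contains no ink pixels")
    top, bottom = ink_rows[0], ink_rows[-1]

    # Dense rows approximate the band between the x-line and the baseline.
    threshold = body_fraction * max(profile)
    body_rows = [i for i, count in enumerate(profile) if count >= threshold]
    x_line, baseline = body_rows[0], body_rows[-1]

    x_height = baseline - x_line + 1
    ascender = x_line - top        # rows of ink above the x-height band
    descender = bottom - baseline  # rows of ink below the baseline
    return {
        "x_height": x_height,
        "ascender_ratio": ascender / x_height,
        "descender_ratio": descender / x_height,
    }


# Toy 8x6 "word": sparse strokes above and below a dense middle band.
toy_word = (
    [[0, 1, 0, 0, 0, 0]] * 2    # ascender strokes only
    + [[1, 1, 1, 1, 1, 1]] * 4  # dense x-height band
    + [[0, 0, 0, 0, 1, 0]] * 2  # descender strokes only
)
features = zone_features(toy_word)
```

Ratios are normalized by the estimated x-height so that they stay comparable across point sizes. In a pipeline like the one the abstract describes, such values would presumably be combined with other attributes into the classifier's input vector.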