Training on Severely Degraded Text-Line Images

  • Authors:
  • Prateek Sarkar; Henry S. Baird; Xiaohu Zhang

  • Venue:
  • ICDAR '03 Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 1
  • Year:
  • 2003

Abstract

We show that document image decoding (DID) supervised training algorithms, as a result of recent refinements, achieve high accuracy with low manual effort even under conditions of severe image degradation in both training and test data. We describe improvements in DID training of character template, set-width, and channel (noise) models. Large-scale experimental trials, using synthetically degraded images of text, have established two new and practically important advantages of DID algorithms:

1. high accuracy ( 99% characters correct) in decoding using models trained on even severely degraded images from the same distribution; and
2. greatly improved accuracy (