Training on Severely Degraded Text-Line Images

Authors:
Prateek Sarkar;Henry S. Baird;Xiaohu Zhang
Venue:
ICDAR '03 Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 1
Year:
2003

Abstract

We show that document image decoding (DID) supervised training algorithms, as a result of recent refinements, achieve high accuracy with low manual effort even under conditions of severe image degradation in both training and test data. We describe improvements in DID training of character template, set-width, and channel (noise) models. Large-scale experimental trials, using synthetically degraded images of text, have established two new and practically important advantages of DID algorithms:
1. high accuracy (99% characters correct) in decoding using models trained on even severely degraded images from the same distribution; and
2. greatly improved accuracy (