Document Image Decoding Using Iterated Complete Path Search with Subsampled Heuristic Scoring

Authors:
Affiliations:
Venue: ICDAR '01 Proceedings of the Sixth International Conference on Document Analysis and Recognition
Year: 2001



Abstract

It has been shown that the computation time of Document Image Decoding can be significantly reduced by employing heuristics in the search for the best decoding of a text line. In the Iterated Complete Path (ICP) method, template matches are performed only along the best path found by dynamic programming on each iteration. When the best path stabilizes, the decoding is optimal and no further template matches are needed. In this way, only a tiny fraction of the potential template matches must be evaluated, and the computation time is typically dominated by the evaluation of the initial heuristic upper bound for each template at each location in the image. The time to compute this bound depends on the resolution at which the matching scores are found. At lower resolution the heuristic is cheaper to compute, but because the resulting bound is weaker, the number of Viterbi iterations increases. We present the optimal (lowest upper-bound) heuristic for any degree of subsampling of multilevel templates and/or interpolation, for use in text line decoding with ICP. The optimal degree of subsampling depends on image quality, but a small amount of template subsampling is typically found to be effective in reducing the overall decoding time.
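The ICP loop described in the abstract can be illustrated with a minimal Python sketch. This is a toy model and not the paper's implementation: the names `best_path`, `icp_decode` and `make_toy_line` are hypothetical, template matches are replaced by a table of random scores, and the coarse bound (the maximum exact score within a block of neighbouring positions) is only an assumed stand-in for a heuristic computed at reduced resolution, not the optimal heuristic presented in the paper.

```python
import math
import random


def best_path(n, widths, score):
    """Viterbi-style DP: highest-scoring template sequence covering [0, n)."""
    best = [-math.inf] * (n + 1)
    back = [None] * (n + 1)
    best[0] = 0.0
    for x in range(1, n + 1):
        for t, w in enumerate(widths):
            start = x - w
            if start < 0 or best[start] == -math.inf:
                continue
            cand = best[start] + score(t, start)
            if cand > best[x]:
                best[x], back[x] = cand, (t, start)
    if best[n] == -math.inf:
        raise ValueError("no template sequence covers the line")
    # Follow back-pointers from the end of the line to recover the path.
    path, x = [], n
    while x > 0:
        t, start = back[x]
        path.append((t, start))
        x = start
    path.reverse()
    return best[n], path


def icp_decode(n, widths, upper_bound, exact_score):
    """Iterate DP, replacing bounds by exact scores only along the best path."""
    exact = {}  # (template, position) -> exact score, filled lazily

    def score(t, x):
        key = (t, x)
        return exact[key] if key in exact else upper_bound(t, x)

    iterations = 0
    while True:
        iterations += 1
        total, path = best_path(n, widths, score)
        pending = [edge for edge in path if edge not in exact]
        if not pending:
            # Every edge on the best path is exact, so the path is optimal:
            # all competing paths are scored by values >= their true scores.
            return total, path, len(exact), iterations
        for t, x in pending:
            exact[(t, x)] = exact_score(t, x)


def make_toy_line(n, widths, block, seed=0):
    """Toy scores (assumption): random exact scores, block-maximum bounds."""
    rng = random.Random(seed)
    table = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in widths]

    def exact_score(t, x):
        return table[t][x]

    def upper_bound(t, x):
        lo = (x // block) * block
        return max(table[t][lo:lo + block])  # never below table[t][x]

    return exact_score, upper_bound


if __name__ == "__main__":
    n, widths = 40, [2, 3, 5]
    for block in (1, 2, 4, 8):
        exact_score, upper_bound = make_toy_line(n, widths, block, seed=1)
        total, path, evaluated, iterations = icp_decode(
            n, widths, upper_bound, exact_score)
        print(block, round(total, 3), evaluated, iterations)
```

In this sketch, a larger `block` plays the role of coarser subsampling: the bound is looser, so the loop may need more iterations and more exact evaluations before the best path consists only of exactly scored edges. The decoded score is the same as that of a full dynamic-programming pass over exact scores, because the bound never underestimates a true score.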