The Multistage Approach to Information Extraction in Degraded Document Images

  • Authors:
  • Chen Yan; Graham Leedham

  • Affiliations:
  • Nanyang Technological University, Singapore; Nanyang Technological University, Singapore

  • Venue:
  • ICPR '04: Proceedings of the 17th International Conference on Pattern Recognition (ICPR'04), Volume 1
  • Year:
  • 2004

Abstract

Global and local adaptive thresholding techniques have each been shown to be effective on particular types of documents, but none produces consistently good results on all types. In this paper a novel method, called the multistage approach, is presented and compared against several existing single-stage algorithms. The multistage approach recursively breaks an image down into sub-regions using quad-tree decomposition and extracts local features from each sub-region, continuing until an appropriate thresholding method can be applied to each sub-region. Quantitative analysis using word recall on 300 degraded historical images obtained from the Library of Congress demonstrates that the method is superior to any of the existing single-stage methods.
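
The recursive scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not say which local features or which thresholding methods the authors use, so everything specific here is an assumption made for illustration: the only feature is the gray-level standard deviation of a sub-region, a low-contrast sub-region is treated as blank background, a high-contrast sub-region is binarized with Otsu's method, and anything in between is split into four quadrants. The names `binarize_multistage`, `otsu_threshold`, `min_size`, `flat_std` and `contrast_std` are hypothetical and not taken from the paper.

```python
import numpy as np


def otsu_threshold(region):
    """Return the Otsu threshold of a uint8 region (dark class is <= threshold)."""
    hist = np.bincount(region.ravel(), minlength=256).astype(float)
    levels = np.arange(256)
    w0 = np.cumsum(hist)                    # pixel count at or below each level
    w1 = hist.sum() - w0                    # pixel count above each level
    m0 = np.cumsum(hist * levels)           # cumulative gray-level mass
    mu0 = m0 / np.maximum(w0, 1)            # mean of the dark class
    mu1 = (m0[-1] - m0) / np.maximum(w1, 1) # mean of the bright class
    between = w0 * w1 * (mu0 - mu1) ** 2    # between-class variance (unnormalized)
    return int(np.argmax(between))


def binarize_multistage(image, min_size=16, flat_std=10.0, contrast_std=40.0):
    """Quad-tree binarization sketch; True marks pixels classified as ink.

    The feature (standard deviation) and both cut-offs are illustrative
    assumptions, not the criteria used in the paper.
    """
    ink = np.zeros(image.shape, dtype=bool)

    def visit(r0, r1, c0, c1):
        region = image[r0:r1, c0:c1]
        spread = region.std()
        if spread < flat_std:
            return  # assumed rule: nearly uniform region holds no ink
        if spread >= contrast_std or min(r1 - r0, c1 - c0) <= min_size:
            # assumed rule: clear contrast (or smallest allowed block) -> Otsu
            ink[r0:r1, c0:c1] = region <= otsu_threshold(region)
            return
        rm, cm = (r0 + r1) // 2, (c0 + c1) // 2
        for a, b in ((r0, rm), (rm, r1)):      # otherwise split into quadrants
            for c, d in ((c0, cm), (cm, c1)):
                visit(a, b, c, d)

    visit(0, image.shape[0], 0, image.shape[1])
    return ink


# Synthetic page: light background with one dark stroke.
page = np.full((64, 64), 200, dtype=np.uint8)
page[10:14, 10:30] = 20
mask = binarize_multistage(page)
```

On the synthetic page the whole image has intermediate contrast, so it is split once; the quadrant containing the stroke is then thresholded locally while the three blank quadrants are left as background. A fuller implementation would presumably choose among several thresholding methods per sub-region, which is the point of the multistage idea, but that selection logic is not recoverable from the abstract.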