Word Searching in CCITT Group 4 Compressed Document Images

  • Authors:
  • Yue Lu;Chew Lim Tan

  • Venue:
  • ICDAR '03 Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 1
  • Year:
  • 2003

Abstract

In this paper, we present a compressed pattern matching method for searching user-queried words in CCITT Group 4 compressed document images, without decompressing. The feature pixels composed of black changing elements and white changing elements are extracted directly from the CCITT Group 4 compressed document images. The connected components are labeled based on a line-by-line strategy according to the relative positions between the changing elements of the current coding line and the changing elements of the reference line. Word boxes are bounded by merging the connected components. A two-stage matching strategy is constructed to measure the dissimilarity between the template image of the user's query word and the words extracted from document images. Experimental results confirmed the validity of the proposed approach.
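
The line-by-line labeling idea can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions not stated in the abstract: the changing elements of each scan line are already available as a sorted list of column positions (the Group 4 bitstream itself is not parsed here), each line starts white so the positions alternate between the start of a black run and the first white pixel after it, and a run that reaches the right edge is closed with the line width. The function name `label_components`, the use of union-find, and the choice of 8-connectivity are likewise illustrative choices.

```python
def label_components(lines):
    """Label black runs line by line and return one bounding box per component.

    lines: list of scan lines; each scan line is a sorted, even-length list of
           changing-element columns [b0, w0, b1, w1, ...], where b is the first
           pixel of a black run and w is the first white pixel after it.
    Returns a sorted list of boxes (x0, y0, x1, y1) with exclusive x1 and y1.
    """
    parent = []  # union-find forest over provisional labels

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    all_runs = []    # (y, start, end, provisional label)
    prev_runs = []   # runs of the reference line: (start, end, label)
    for y, changes in enumerate(lines):
        cur_runs = []
        for start, end in zip(changes[0::2], changes[1::2]):
            label = None
            for p_start, p_end, p_label in prev_runs:
                # 8-connectivity: runs touch if they overlap or meet diagonally
                if p_start <= end and start <= p_end:
                    if label is None:
                        label = find(p_label)
                    else:
                        union(label, p_label)
                        label = find(label)
            if label is None:  # no neighbour on the reference line: new component
                label = len(parent)
                parent.append(label)
            cur_runs.append((start, end, label))
            all_runs.append((y, start, end, label))
        prev_runs = cur_runs  # the current coding line becomes the reference line

    boxes = {}
    for y, start, end, label in all_runs:
        root = find(label)
        x0, y0, x1, y1 = boxes.get(root, (start, y, end, y + 1))
        boxes[root] = (min(x0, start), min(y0, y), max(x1, end), max(y1, y + 1))
    return sorted(boxes.values())


# Two runs on line 0 are joined by a single run on line 1; line 3 holds a lone pixel.
example = [[1, 3, 5, 7], [1, 7], [], [9, 10]]
print(label_components(example))
```

Merging the resulting component boxes into word boxes, and the two-stage matching against the query template, are separate steps that this sketch does not attempt.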