Multi-scale Document Description Using Rectangular Granulometries

  • Authors:
  • Andrew D. Bagdanov;Marcel Worring

  • Affiliations:
  • -;-

  • Venue:
  • DAS '02 Proceedings of the 5th International Workshop on Document Analysis Systems V
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

When comparing documents images based on visual similarity it is difficult to determine the correct scale and features for document representation. We report on new form of multivariate granulometries based on rectangles of varying size and aspect ratio. These rectangular granulometries are used to probe the layout structure of document images, and the rectangular size distributions derived from them are used as descriptors for document images. Feature selection is used to reduce the dimensionality and redundancy of the size distributions, while preserving the essence of the visual appearance of a document. Experimental results indicate that rectangular size distributions are an effective way to characterize visual similarity of document images and provide insightful interpretation of classification and retrieval results in the original image space rather than the abstract feature space.