Layout based document image retrieval by means of XY tree reduction

  • Authors:
  • Simone Marinai;Emanuele Marino;Giovanni Soda

  • Affiliations:
  • DSI - University of Florence, Italy;DSI - University of Florence, Italy;DSI - University of Florence, Italy

  • Venue:
  • ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

We analyze a system for the retrieval of document images on the basis of layout similarity. Layout objects are extracted and represented with the XY tree. Page similarity is computed with a tree-edit distance algorithm. The peculiarity of the approach is the use of tree grammars to model the variations in the tree which are due to segmentation algorithms or to structural differences between documents with similar layout. A few class-independent grammatical rules are used to modify each tree and obtain a reduced tree that is supposed to preserve the most relevant features of the page.