Two Geometric Algorithms for Layout Analysis

  • Authors: Thomas M. Breuel
  • Venue: DAS '02 Proceedings of the 5th International Workshop on Document Analysis Systems V
  • Year: 2002

Abstract

This paper presents geometric algorithms for solving two key problems in layout analysis: finding a cover of the background whitespace of a document in terms of maximal empty rectangles, and finding constrained maximum likelihood matches of geometric text line models in the presence of geometric obstacles. The algorithms are considerably easier to implement than prior methods, they return globally optimal solutions, and they require no heuristics. The paper also introduces an evaluation function that reliably identifies maximal empty rectangles corresponding to column boundaries. Combining this evaluation function with the two geometric algorithms results in an easy-to-implement layout analysis system. Reliability of the system is demonstrated on documents from the UW3 database.
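To make the first problem concrete, the following is a minimal sketch of a best-first (branch-and-bound) search for the single largest empty rectangle among axis-aligned obstacle boxes. The abstract does not spell out the procedure, so the splitting rule, the pivot choice, and every name here (`largest_empty_rectangle`, `area`, `overlaps`) are illustrative assumptions rather than the paper's implementation. It returns only one rectangle and does not attempt the full whitespace cover or the evaluation function for column boundaries.

```python
import heapq
from typing import List, Optional, Tuple

# A rectangle is (x0, y0, x1, y1) with x0 < x1 and y0 < y1.
Rect = Tuple[float, float, float, float]


def area(r: Rect) -> float:
    """Area of r, or 0 if r is degenerate or inverted."""
    return max(0, r[2] - r[0]) * max(0, r[3] - r[1])


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the interiors of a and b intersect (touching edges do not count)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def largest_empty_rectangle(bound: Rect, obstacles: List[Rect]) -> Optional[Rect]:
    """Best-first search; the area of a candidate is an upper bound on any
    empty rectangle inside it, so the first obstacle-free candidate popped
    from the queue has maximal area (assumed scheme, not from the source)."""
    counter = 0  # tie-breaker so the heap never compares rectangles
    start_obs = [o for o in obstacles if overlaps(o, bound)]
    heap = [(-area(bound), counter, bound, start_obs)]
    while heap:
        _, _, r, obs = heapq.heappop(heap)
        if not obs:
            return r  # no obstacles left inside: globally largest empty rectangle
        # Pivot: the obstacle whose center is closest to the center of r.
        cx, cy = (r[0] + r[2]) / 2, (r[1] + r[3]) / 2
        pivot = min(
            obs,
            key=lambda o: ((o[0] + o[2]) / 2 - cx) ** 2 + ((o[1] + o[3]) / 2 - cy) ** 2,
        )
        # Any empty rectangle inside r must lie on one side of the pivot.
        candidates = [
            (r[0], r[1], pivot[0], r[3]),  # side with smaller x
            (pivot[2], r[1], r[2], r[3]),  # side with larger x
            (r[0], r[1], r[2], pivot[1]),  # side with smaller y
            (r[0], pivot[3], r[2], r[3]),  # side with larger y
        ]
        for s in candidates:
            if area(s) <= 0:
                continue
            counter += 1
            sub_obs = [o for o in obs if overlaps(o, s)]
            heapq.heappush(heap, (-area(s), counter, s, sub_obs))
    return None  # the bound is completely covered by obstacles
```

For example, on a 10 by 10 page with two text blocks `(0, 0, 4, 10)` and `(6, 0, 10, 10)`, the search returns the gutter between them, `(4, 0, 6, 10)`. Ranking candidates by area alone is a simplification; a layout system would substitute a quality function that favors tall, narrow rectangles such as column separators.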