Logical document conversion: combining functional and formal knowledge

  • Authors: Hervé Déjean; Jean-Luc Meunier

  • Affiliations: Xerox Research Centre Europe; Xerox Research Centre Europe

  • Venue: Proceedings of the 2007 ACM symposium on Document engineering
  • Year: 2007

Abstract

This paper presents a method for document layout analysis based on identifying the function of document elements (what they do). This approach is orthogonal and complementary to the traditional view, which is based on the form of document elements (how they are constructed). A key advantage of functional knowledge is that the functions of some document elements are very stable from document to document and over time. Because it relies on this stability, the method is not affected by layout variability, a key issue in logical document analysis, and is thus very robust and versatile. Recognition proceeds in two steps: the method first applies functional knowledge, then uses formal knowledge as a source of feedback to correct some errors. This second step allows the method to adapt to specific documents by exploiting their formal specificities.
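
The two-step idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which document elements are targeted, so page numbers are used here purely as an assumed example of an element with a stable function (its value increases by one from page to page). The `Element` type, the `zone` feature, and both function names are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    page: int   # 0-based page index
    text: str   # textual content of the element
    zone: str   # coarse layout position (hypothetical formal feature)


def functional_pass(pages):
    """Step 1: use function only. An element is treated as a page number
    when its numeric value is one more than a numeric element on the
    previous page, wherever it sits on the page."""
    labelled = set()
    for prev, curr in zip(pages, pages[1:]):
        for a in prev:
            for b in curr:
                if (a.text.isdecimal() and b.text.isdecimal()
                        and int(b.text) == int(a.text) + 1):
                    labelled.update((a, b))
    return labelled


def formal_feedback(pages, labelled):
    """Step 2: learn the dominant zone of the step-1 results and use it
    to drop outliers and to fill pages where step 1 found nothing."""
    if not labelled:
        return set()
    zone, _ = Counter(e.zone for e in labelled).most_common(1)[0]
    corrected = {e for e in labelled if e.zone == zone}
    covered = {e.page for e in corrected}
    for page in pages:
        in_zone = [e for e in page if e.zone == zone]
        # Only fill a gap when the zone holds a single, unambiguous element.
        if len(in_zone) == 1 and in_zone[0].page not in covered:
            corrected.add(in_zone[0])
    return corrected


# Toy document: the number on page index 2 is garbled ("E" instead of "3").
pages = [
    [Element(0, "Introduction", "top"), Element(0, "1", "bottom")],
    [Element(1, "Section 2", "body"), Element(1, "2", "bottom")],
    [Element(2, "E", "bottom"), Element(2, "more text", "body")],
    [Element(3, "4", "bottom")],
]
found = functional_pass(pages)
corrected = formal_feedback(pages, found)
```

In this toy input the functional pass labels only the elements "1" and "2", because the garbled value breaks the sequence on the following pages. The formal pass then observes that both sit in the "bottom" zone and uses that document-specific regularity to recover the elements on the last two pages, which mirrors the feedback role the abstract assigns to formal knowledge.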