Markov logic networks for document layout correction

  • Authors:
  • Stefano Ferilli; Teresa M. A. Basile; Nicola Di Mauro

  • Affiliations:
  • Department of Computer Science, LACAM laboratory, University of Bari "Aldo Moro", Bari (all three authors)

  • Venue:
  • IEA/AIE'11: Proceedings of the 24th International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems: Modern Approaches in Applied Intelligence - Volume Part I
  • Year:
  • 2011

Abstract

The huge number of documents in digital formats has raised the need for effective content-based retrieval techniques. Since manual indexing is infeasible and subjective, automatic techniques are the obvious solution. In particular, the ability to properly identify and understand a document's structure is crucial in order to focus only on its most significant components. The quality of the layout analysis outcome therefore affects the subsequent understanding steps. Unfortunately, due to the variety of document styles and formats, the automatically detected structure often needs to be adjusted manually. In this work we present a tool based on Markov Logic Networks that infers correction rules to be applied to forthcoming documents. The proposed tool, embedded in a prototype version of the document processing system DOMINUS, showed good performance in real-world experiments.