A Fast Algorithm for Bottom-Up Document Layout Analysis
IEEE Transactions on Pattern Analysis and Machine Intelligence
Syntactic Segmentation and Labeling of Digitized Pages from Technical Journals
IEEE Transactions on Pattern Analysis and Machine Intelligence
Multiscale Segmentation of Unstructured Document Pages Using Soft Decision Integration
IEEE Transactions on Pattern Analysis and Machine Intelligence
Two Geometric Algorithms for Layout Analysis
DAS '02 Proceedings of the 5th International Workshop on Document Analysis Systems V
Correcting the Document Layout: A Machine Learning Approach
ICDAR '03 Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 1
A Statistical Learning Approach To Document Image Analysis
ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition
Machine Learning
Introduction to Statistical Relational Learning (Adaptive Computation and Machine Learning)
Introduction to Statistical Relational Learning (Adaptive Computation and Machine Learning)
Hi-index | 0.00 |
The huge amount of documents in digital formats raised the need of effective content-based retrieval techniques. Since manual indexing is infeasible and subjective, automatic techniques are the obvious solution. In particular, the ability of properly identifying and understanding a document's structure is crucial, in order to focus on the most significant components only. Thus, the quality of the layout analysis outcome biases the next understanding steps. Unfortunately, due to the variety of document styles and formats, the automatically found structure often needs to be manually adjusted. In this work we present a tool based on Markov Logic Networks to infer corrections rules to be applied to forthcoming documents. The proposed tool, embedded in a prototypical version of the document processing system DOMINUS, revealed good performance in real-world experiments.