There is an increasingly pressing need for document analysis methods that can cope with images of documents containing printed regions of complex shapes. Most past page segmentation and classification approaches assume rectangular regions and use a bounding-box representation; what is needed instead is a more flexible description that retains most of the functionality of the rectangle representation. In the first part of this paper, the practical considerations of describing and handling complex-shaped regions are examined and an appropriate representation scheme is proposed. For page classification, a new approach based on the description of white space inside regions is presented. In contrast to previous page classification approaches, skewed and complex-shaped regions are handled efficiently, and the features are derived without time-consuming access to the pixel-based image data.