Adaptive Segmentation of Document Images

Authors:
Don Sylwester
Affiliations:
-
Venue:
ICDAR '01 Proceedings of the Sixth International Conference on Document Analysis and Recognition
Year:
2001

Citing 0
Cited 2

Machine Printed Text and Handwriting Identification in Noisy Document Images

IEEE Transactions on Pattern Analysis and Machine Intelligence
Learning to segment document images

PReMI'05 Proceedings of the First international conference on Pattern Recognition and Machine Intelligence

Quantified Score

Hi-index	0.00

Visualization

Abstract

Abstract: A single-parameter text-line extraction algorithm is described along with an efficient technique for estimating the optimal value for the parameter for individual images without need for ground truth. The algorithm is based on three simple tree operations, cut, glue and flip. An XY-tree representing the segmentation is incrementally transformed to reflect a change in the parameter while intrinsic measures of the cost of the transformation are used to detect when specific tree operations would cause an error if they were performed, allowing these errors to be avoided. The algorithm correctly identified 98.8% of the area of the ground truth bounding boxes and committed no column bridging errors on a set of 97 test images selected from a variety of technical journals.