A ground truth bleed-through document image database
TPDL'12 Proceedings of the Second international conference on Theory and Practice of Digital Libraries
GD'12 Proceedings of the 20th international conference on Graph Drawing
Proceedings of the 2013 ACM symposium on Document engineering
Nonrigid recto-verso registration using page outline structure and content preserving warps
Proceedings of the 2nd International Workshop on Historical Document Imaging and Processing
Many types of degradation can make ancient manuscripts very hard to read. In bleed-through, the text from the reverse (verso) side of a page seeps through to the front (recto). In this paper, we propose hysteresis thresholding to greatly reduce bleed-through. A single threshold cannot cleanly separate ink from bleed-through because the intensity ranges of the two classes overlap. Hysteresis thresholding overcomes this limitation in two steps: thresholding, followed by ink regrowth. To measure the effectiveness of this approach quantitatively, we constructed a novel dataset that features bleed-through and comes with ground truth. We evaluated our method and a number of previously proposed approaches on ink-pixel precision and recall. Hysteresis thresholding significantly improves over existing methods.
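The two steps named in the abstract, thresholding and ink regrowth, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no threshold values, connectivity rule, or preprocessing, so the function names, the `strong` and `weak` parameters, the 8-connectivity, and the assumption of dark ink on light paper are all illustrative choices. The second function computes ink-pixel precision and recall against a ground-truth mask, in the usual sense of those terms.

```python
from collections import deque


def hysteresis_threshold(image, strong, weak):
    """Return a boolean ink mask for a grayscale image (assumed: dark ink, light paper).

    image  -- list of rows of intensities (0 = black, 255 = white)
    strong -- pixels at or below this value are taken as certain ink (seeds)
    weak   -- pixels at or below this value are ink candidates; weak >= strong
    Both thresholds are hypothetical parameters, not values from the paper.
    """
    if weak < strong:
        raise ValueError("weak must be >= strong")
    rows = len(image)
    cols = len(image[0]) if rows else 0
    mask = [[False] * cols for _ in range(rows)]
    queue = deque()

    # Step 1: strict thresholding keeps only the darkest pixels as seeds.
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= strong:
                mask[r][c] = True
                queue.append((r, c))

    # Step 2: ink regrowth. Breadth-first growth from the seeds into
    # 8-connected neighbours that pass the looser threshold.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not mask[nr][nc] and image[nr][nc] <= weak):
                    mask[nr][nc] = True
                    queue.append((nr, nc))
    return mask


def ink_precision_recall(predicted, truth):
    """Ink-pixel precision and recall of a predicted mask against ground truth."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy page: a dark stroke with a faint edge (left) and a faint
# bleed-through stroke (right) whose intensities overlap the faint edge.
page = [
    [250, 250, 250, 250, 250, 250],
    [250,  40, 120, 250, 130, 250],
    [250, 110, 250, 250, 125, 250],
    [250, 250, 250, 250, 250, 250],
]
truth = [[v <= 120 for v in row] for row in page]  # hand-made ground truth

ink = hysteresis_threshold(page, strong=60, weak=140)
plain = hysteresis_threshold(page, strong=140, weak=140)  # single threshold
print(ink_precision_recall(ink, truth))    # (1.0, 1.0)
print(ink_precision_recall(plain, truth))  # (0.6, 1.0)
```

On this toy page, the single loose threshold also accepts the two bleed-through pixels, while a single strict threshold at 60 would keep only the darkest ink pixel. Regrowth recovers the faint ink edge because it touches a seed; the bleed-through stroke contains no seed and is dropped.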