Document Ink Bleed-Through Removal with Two Hidden Markov Random Fields and a Single Observation Field

Authors: Christian Wolf
Affiliations: Université de Lyon, CNRS, and INSA-Lyon, France
Venue: IEEE Transactions on Pattern Analysis and Machine Intelligence
Year: 2010

Citing: 0
Cited by: 7

User-assisted ink-bleed reduction
IEEE Transactions on Image Processing - Special section on distributed camera networks: sensing, processing, communication, and implementation

Visual enhancement of old documents with hyperspectral imaging
Pattern Recognition

A spatially adaptive statistical method for the binarization of historical manuscripts and degraded document images
Pattern Recognition

A real-world noisy unstructured handwritten notebook corpus for document image analysis research
Proceedings of the 2011 Joint Workshop on Multilingual OCR and Analytics for Noisy Unstructured Text Data

A ground truth bleed-through document image database
TPDL'12 Proceedings of the Second international conference on Theory and Practice of Digital Libraries

Nonrigid recto-verso registration using page outline structure and content preserving warps
Proceedings of the 2nd International Workshop on Historical Document Imaging and Processing

F-measure as the error function to train neural networks
IWANN'13 Proceedings of the 12th international conference on Artificial Neural Networks: advances in computational intelligence - Volume Part I

Abstract

We present a new method for blind document bleed-through removal based on separate Markov Random Field (MRF) regularization for the recto and the verso side, where the separate priors are derived from the full graph. The segmentation algorithm is based on Bayesian Maximum a Posteriori (MAP) estimation. The separate approach has two advantages: the prior adapts to the content creation process (e.g., superimposing two handwritten pages), and the estimation of the recto pixels improves through an estimation of the verso pixels covered by recto pixels. Moreover, the formulation as a binary labeling problem with two hidden labels per pixel naturally leads to an efficient optimization method based on minimum cut/maximum flow in a graph. The proposed method is evaluated on scanned document images from the 18th century, showing improved character recognition results compared to other restoration methods.
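
The link between MAP estimation in a binary MRF and minimum cut/maximum flow can be illustrated with a small sketch. This is not the paper's two-field recto/verso model: it assumes a single hidden binary label per pixel, a simple Potts smoothness prior with weight `lam`, and a data term that charges `flip_cost` whenever a label disagrees with the observed pixel. The names `max_flow_min_cut`, `map_denoise`, `lam`, and `flip_cost` are hypothetical and chosen for illustration only.

```python
from collections import deque

def max_flow_min_cut(cap, s, t):
    """Edmonds-Karp max flow; returns (flow value, nodes on the source side of a min cut)."""
    # Build the residual graph and make sure every edge has a reverse edge.
    res = {u: dict(nb) for u, nb in cap.items()}
    for u, nb in cap.items():
        for v in nb:
            res.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            # No augmenting path left: nodes reached from s form the source side.
            return flow, set(parent)
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u
        flow += bottleneck

def map_denoise(obs, lam=1.0, flip_cost=2.0):
    """MAP labeling of a binary image under a Potts prior (assumed toy model)."""
    h, w = len(obs), len(obs[0])
    S, T = "s", "t"
    cap = {S: {}, T: {}}
    for y in range(h):
        for x in range(w):
            p = (y, x)
            cap.setdefault(p, {})
            d0 = flip_cost if obs[y][x] == 1 else 0.0  # cost of assigning label 0
            d1 = flip_cost if obs[y][x] == 0 else 0.0  # cost of assigning label 1
            cap[S][p] = d1  # cut when p falls on the sink side (label 1)
            cap[p][T] = d0  # cut when p stays on the source side (label 0)
            # 4-connected smoothness edges, cut when neighbors take different labels.
            for q in ((y + 1, x), (y, x + 1)):
                if q[0] < h and q[1] < w:
                    cap.setdefault(q, {})
                    cap[p][q] = lam
                    cap[q][p] = lam
    energy, source_side = max_flow_min_cut(cap, S, T)
    labels = [[0 if (y, x) in source_side else 1 for x in range(w)] for y in range(h)]
    return labels, energy

noisy = [[1, 1, 1],
         [1, 0, 1],
         [1, 1, 1]]
labels, energy = map_denoise(noisy, lam=1.0, flip_cost=2.0)
```

In this toy case the isolated center pixel is relabeled, because flipping it costs 2 while keeping it would cost 4 in smoothness penalties. Raising `flip_cost` above the total smoothness penalty keeps the pixel as observed. The value of the minimum cut equals the energy of the MAP labeling, which is why such labeling problems can be solved exactly and efficiently for submodular pairwise terms.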