Text line extraction for historical document images
Pattern Recognition Letters
This paper presents an approach to text line extraction in handwritten document images that combines local and global techniques. We propose a graph-based technique to detect the touching and proximity errors that are common in handwritten text lines. In a refinement step, we use Expectation-Maximization (EM) to iteratively split the error segments and obtain correct text lines. The correction method improves accuracy on datasets of Arabic document images. Results on a set of artificially generated proximity images show that the method is effective at handling touching errors in handwritten document images.
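The EM refinement step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes, purely for illustration, that an error segment merges exactly two text lines and that the vertical pixel coordinates of its foreground pixels can be modelled as a two-component 1-D Gaussian mixture. The function name `em_split_rows` and its parameters are hypothetical.

```python
import math

def em_split_rows(ys, iters=50):
    """Fit a 2-component 1-D Gaussian mixture to y-coordinates with EM.

    Assumption (not from the paper): the segment merges exactly two lines.
    Returns the two component means and a hard label (0 or 1) per input value.
    """
    lo, hi = min(ys), max(ys)
    # Initialise the means at the quarter points of the vertical extent.
    mu = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    var = [max((hi - lo) ** 2 / 16.0, 1e-6)] * 2
    pi = [0.5, 0.5]
    resp = [[0.5, 0.5] for _ in ys]
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each pixel row.
        resp = []
        for y in ys:
            p = [
                pi[k]
                * math.exp(-((y - mu[k]) ** 2) / (2 * var[k]))
                / math.sqrt(2 * math.pi * var[k])
                for k in range(2)
            ]
            s = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pk / s for pk in p])
        # M-step: re-estimate weight, mean, and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            mu[k] = sum(r[k] * y for r, y in zip(resp, ys)) / nk
            var[k] = max(
                sum(r[k] * (y - mu[k]) ** 2 for r, y in zip(resp, ys)) / nk, 1e-6
            )
            pi[k] = nk / len(ys)
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return mu, labels

# Toy data: y-coordinates of pixels from two lines that were merged into one segment.
ys = [10, 11, 12, 9, 10, 11, 30, 31, 29, 30, 32, 31]
means, labels = em_split_rows(ys)
```

Each pixel is assigned to the component with the larger responsibility, which splits the segment into an upper and a lower line. A fuller treatment would likely use both pixel coordinates and handle segments that merge more than two lines.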