Text line segmentation in freestyle handwritten documents remains an open problem in document analysis. Curvilinear text lines and small gaps between neighboring text lines pose a challenge to algorithms developed for machine-printed or hand-printed documents. In this paper, we propose a novel approach based on density estimation and a state-of-the-art image segmentation technique, the level set method. From an input document image, we estimate a probability map in which each element represents the probability that the underlying pixel belongs to a text line. The level set method is then used to determine the boundaries of neighboring text lines by evolving an initial estimate. Unlike connected-component-based methods (for example, [1], [2]), the proposed algorithm does not use any script-specific knowledge. Extensive quantitative experiments on freestyle handwritten documents in diverse scripts, such as Arabic, Chinese, Korean, and Hindi, demonstrate that our algorithm consistently outperforms previous methods [1]-[3]. Further experiments show that the proposed algorithm is robust to scale change, rotation, and noise.
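The first stage described above, estimating a per-pixel text-line probability map and deriving an initial estimate from it, might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the density estimator, so the anisotropic Gaussian smoothing, the function names (`gaussian_kernel`, `text_probability_map`, `initial_estimate`), the parameters `sigma_x` and `sigma_y`, and the 0.5 threshold are all hypothetical choices, and the level set evolution that would refine the boundaries is not shown.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def text_probability_map(binary, sigma_x=8.0, sigma_y=2.0):
    """Smooth a binary ink mask (1 = ink) and rescale it to [0, 1].

    Assumption: a wider kernel along x than along y, which suits roughly
    horizontal text lines. The image is assumed to be larger than the
    kernels in both dimensions.
    """
    img = binary.astype(float)
    kx = gaussian_kernel(sigma_x)
    ky = gaussian_kernel(sigma_y)
    # Separable convolution: along each row, then along each column.
    img = np.apply_along_axis(lambda r: np.convolve(r, kx, mode="same"), 1, img)
    img = np.apply_along_axis(lambda c: np.convolve(c, ky, mode="same"), 0, img)
    peak = img.max()
    return img / peak if peak > 0 else img

def initial_estimate(prob, threshold=0.5):
    """Threshold the probability map to get a rough text-line mask."""
    return prob >= threshold
```

On a synthetic page with two dotted horizontal strokes, the thresholded map joins the dots of each stroke into one band while leaving the gap between strokes empty; a level set evolution would then start from a mask of this kind.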