Script-Independent Text Line Segmentation in Freestyle Handwritten Documents

Authors:
Yi Li;Yefeng Zheng;David Doermann;Stefan Jaeger
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
2008

Citing 0
Cited 15

Text line segmentation in handwritten documents using Mumford-Shah model
Pattern Recognition

Handwritten Chinese text line segmentation by clustering with distance metric learning
Pattern Recognition

Language identification for handwritten document images using a shape codebook
Pattern Recognition

Handwritten document image segmentation into text lines and words
Pattern Recognition

Simultaneous Document Margin Removal and Skew Correction Based on Corner Detection in Projection Profiles
ICIAP '09 Proceedings of the 15th International Conference on Image Analysis and Processing

Lennard-Jones force field for geometric active contour
Signal Processing

Text Line Segmentation for Unconstrained Handwritten Document Images Using Neighborhood Connected Component Analysis
PReMI '09 Proceedings of the 3rd International Conference on Pattern Recognition and Machine Intelligence

Textline information extraction from grayscale camera-captured document images
ICIP'09 Proceedings of the 16th IEEE international conference on Image processing

State estimation in a document image and its application in text block identification and text line extraction
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part II

A new scheme for unconstrained handwritten text-line segmentation
Pattern Recognition

Automatic line and word segmentation applied to densely line-skewed historical handwritten document images
Integrated Computer-Aided Engineering

Text line segmentation for gray scale historical document images
Proceedings of the 2011 Workshop on Historical Document Imaging and Processing

Similarity-based training set acquisition for continuous handwriting recognition
Information Sciences: an International Journal

Multilingual OCR research and applications: an overview
Proceedings of the 4th International Workshop on Multilingual OCR

Text line extraction for historical document images
Pattern Recognition Letters

Quantified Score

Hi-index	0.14

Abstract

Text line segmentation in freestyle handwritten documents remains an open problem in document analysis. Curvilinear text lines and small gaps between neighboring text lines challenge algorithms developed for machine-printed or hand-printed documents. In this paper, we propose a novel approach based on density estimation and a state-of-the-art image segmentation technique, the level set method. From an input document image, we estimate a probability map in which each element represents the probability that the underlying pixel belongs to a text line. The level set method is then used to determine the boundaries of neighboring text lines by evolving an initial estimate. Unlike connected-component-based methods (for example, [1], [2]), the proposed algorithm does not use any script-specific knowledge. Extensive quantitative experiments on freestyle handwritten documents in diverse scripts, such as Arabic, Chinese, Korean, and Hindi, demonstrate that our algorithm consistently outperforms previous methods [1]-[3]. Further experiments show that the proposed algorithm is robust to scale change, rotation, and noise.
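
The first stage described in the abstract, turning a document image into a per-pixel text-line probability map, might be sketched as follows. This is an illustration only, not the authors' implementation. The abstract does not name a density estimator, so the separable Gaussian kernel that is wider along the writing direction than across it, the values `sigma_x` and `sigma_y`, the 0.5 threshold, and every function name here are assumptions. The level set evolution is omitted; the thresholded mask only stands in for the kind of initial estimate that such an evolution would refine.

```python
import math

def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_rows(image, kernel):
    """Convolve each row with the kernel; pixels outside the image count as 0."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = x + k - radius
                if 0 <= xx < width:
                    acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(image):
    return [list(col) for col in zip(*image)]

def probability_map(binary, sigma_x=4.0, sigma_y=1.0):
    """Smooth ink pixels (1 = ink) with an assumed anisotropic Gaussian
    and rescale so that the largest value is 1."""
    horizontal = smooth_rows(binary, gaussian_kernel(sigma_x))
    both = transpose(smooth_rows(transpose(horizontal), gaussian_kernel(sigma_y)))
    peak = max(max(row) for row in both) or 1.0
    return [[v / peak for v in row] for row in both]

def initial_regions(prob, threshold=0.5):
    """Threshold the map; a stand-in for an initial text-line estimate."""
    return [[1 if v >= threshold else 0 for v in row] for row in prob]

def count_row_bands(mask):
    """Count maximal runs of consecutive rows that contain foreground."""
    bands, inside = 0, False
    for row in mask:
        has_ink = any(row)
        if has_ink and not inside:
            bands += 1
        inside = has_ink
    return bands

def make_toy_page(height=11, width=30, ink_rows=(2, 8)):
    """Build a tiny synthetic page: two text lines of three 'words' each."""
    page = [[0] * width for _ in range(height)]
    for y in ink_rows:
        for start, end in ((2, 9), (12, 19), (22, 28)):
            for x in range(start, end):
                page[y][x] = 1
    return page

page = make_toy_page()
mask = initial_regions(probability_map(page))
```

On this toy page the wide horizontal kernel bridges the gaps between words on the same line, while the narrow vertical kernel keeps the two lines apart. Real handwriting with curvilinear or touching lines is exactly where a simple threshold like this breaks down, which is the case the abstract assigns to the level set method.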