A Hough based algorithm for extracting text lines in handwritten documents
ICDAR '95 Proceedings of the Third International Conference on Document Analysis and Recognition (Volume 2) - Volume 2
Line Separation for Complex Document Images Using Fuzzy Runlength
DIAL '04 Proceedings of the First International Workshop on Document Image Analysis for Libraries (DIAL'04)
Text line extraction from multi-skewed handwritten documents
Pattern Recognition
Handwriting Segmentation Contest
ICDAR '07 Proceedings of the Ninth International Conference on Document Analysis and Recognition - Volume 02
Script-Independent Text Line Segmentation in Freestyle Handwritten Documents
IEEE Transactions on Pattern Analysis and Machine Intelligence
Hi-index | 0.00 |
Text line extraction is the first and one of the most critical steps in optical character recognition (OCR) of unconstrained handwritten documents. The present work reports a new methodology based on comparison of neighborhood connected components to determine whether they belong to the same text line. Components which are very small or very large compared to the average component height are ignored in the preprocessing step. During post-processing, such components are reconsidered and allocated to the lines to which they most suitably belong. The performance of the developed technique is evaluated on the benchmark training dataset for the ICDAR 2009 handwriting segmentation contest. The dataset consists of English, French, German and Greek handwritten texts. The overall text line identification accuracy on the mentioned dataset is observed to be around 93.35%.