In this paper, we present a novel graph-based method for extracting handwritten text lines from monochromatic Arabic document images. Our approach consists of two steps: coarse text line estimation using the primary components that define each line, followed by assignment of diacritic components, which are more difficult to associate with a given line. In the first step, we estimate the local orientation at each primary component to build a sparse similarity graph. We then use a shortest-path algorithm to compute similarities between non-neighboring components. From this graph, we obtain coarse text lines using two estimates obtained from affinity propagation and breadth-first search. In the second step, we assign secondary components to the text lines. The proposed method is fast and robust to non-uniform skew and character size variations, which are common in handwritten text lines. We evaluate our method using a pixel-matching criterion and report 96% accuracy on a dataset of 125 Arabic document images. We also present a proximity analysis on datasets generated by artificially decreasing the spacing between text lines to demonstrate the robustness of our approach.
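The coarse estimation step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the edge cost (distance along the local orientation plus a penalized distance across it), the parameters `k`, `across_weight`, and `max_cost`, and all function names are assumptions made for illustration. The sketch covers only the sparse graph, the shortest-path costs, and the breadth-first grouping; affinity propagation and the assignment of diacritic components are omitted.

```python
import heapq
import math
from collections import deque


def build_sparse_graph(centroids, angles, k=2, across_weight=5.0, max_cost=3.0):
    """Link each component to its k cheapest neighbours.

    Assumed cost: distance along the component's local orientation plus a
    weighted distance across it, so that jumps between lines are expensive.
    """
    graph = {i: {} for i in range(len(centroids))}
    for i, (xi, yi) in enumerate(centroids):
        c, s = math.cos(angles[i]), math.sin(angles[i])
        costs = []
        for j, (xj, yj) in enumerate(centroids):
            if i == j:
                continue
            dx, dy = xj - xi, yj - yi
            along = abs(dx * c + dy * s)
            across = abs(-dx * s + dy * c)
            costs.append((along + across_weight * across, j))
        for cost, j in sorted(costs)[:k]:
            if cost <= max_cost:
                # Keep the graph symmetric and retain the cheaper estimate.
                best = min(cost, graph[i].get(j, math.inf))
                graph[i][j] = graph[j][i] = best
    return graph


def shortest_path_costs(graph, source):
    """Dijkstra: path costs from source to every reachable component."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def coarse_lines(graph):
    """Group components into coarse lines with breadth-first search."""
    seen, lines = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue, line = deque([start]), []
        while queue:
            u = queue.popleft()
            line.append(u)
            for v in graph[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        lines.append(sorted(line))
    return lines


# Hypothetical input: two horizontal rows of three components each.
centroids = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
angles = [0.0] * len(centroids)
graph = build_sparse_graph(centroids, angles)
print(coarse_lines(graph))
print(shortest_path_costs(graph, 0))
```

In this sketch the shortest-path costs play the role of similarities between non-neighboring components: components on the same line are reachable at low cost, while components on a different line are unreachable or expensive.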