There is considerable interest in digitizing handwritten historical documents in order to preserve the cultural heritage of nations. In general, these manuscript images pose segmentation difficulties that non-historical documents do not. The problems stem from features such as paper aging, faded ink, back-to-front ink superposition, and variable line skew, among others. This paper presents a methodology for detecting and extracting the text lines in images of complex handwritten historical documents. The proposed line segmentation algorithm computes a binary transition map of the document and then extracts and refines the corresponding line regions through skeletonization. To improve the accuracy of line segmentation, a new graph-based splitting method is introduced to separate touching lines. Once the text lines have been segmented, we propose an algorithm based on mathematical morphology operators and position heuristics to extract the component words of each text line. The robustness and accuracy of our approach were tested on digitized pages from two complex historical document datasets: the correspondence of Nabuco and the family papers of Graham Bell. We also compared our algorithms, with favorable results, against other general line and word segmentation algorithms presented at the ICDAR 2007 Handwriting Segmentation Contest.
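As a rough illustration of the word-extraction idea only, and not the authors' algorithm, the following minimal sketch applies a one-dimensional morphological closing to the column profile of an already segmented, binarized text line, so that small gaps between letters are bridged while larger gaps between words survive. All names here (`column_profile`, `close_1d`, `segment_words`, the `radius` parameter) are hypothetical, and the abstract does not specify which operators or position heuristics the paper actually uses.

```python
def column_profile(line_img):
    """Return 1 for each column that contains at least one ink pixel, else 0.

    line_img is a list of equal-length rows of 0/1 values (1 = ink).
    """
    return [1 if any(col) else 0 for col in zip(*line_img)]


def close_1d(signal, radius):
    """Binary closing (dilation, then erosion) with a flat window of
    width 2 * radius + 1. Gaps of at most 2 * radius columns are filled."""
    n = len(signal)
    # Zero padding keeps the left and right margins from being filled.
    padded = [0] * radius + list(signal) + [0] * radius
    m = len(padded)
    dilated = [
        max(padded[max(0, i - radius): i + radius + 1]) for i in range(m)
    ]
    # Erode only the positions that map back to the original signal;
    # their windows always lie fully inside the padded array.
    return [
        min(dilated[i - radius: i + radius + 1])
        for i in range(radius, radius + n)
    ]


def runs(signal):
    """Return half-open (start, end) spans of consecutive 1s."""
    spans, start = [], None
    for i, value in enumerate(signal):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(signal)))
    return spans


def segment_words(line_img, radius):
    """Assumed heuristic: words are the runs left after closing the profile."""
    return runs(close_1d(column_profile(line_img), radius))


# Toy text line: two strokes one column apart, then a four-column gap.
toy_line = [
    [0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0],
]
print(segment_words(toy_line, radius=1))
```

On this toy input, `radius=1` merges the first two strokes into one span and keeps the third separate, giving `[(1, 6), (10, 12)]`. With `radius=0` nothing is merged. In practice the radius would have to be tuned to the writing, for example from the distribution of gap widths on the line, which is an assumption of this sketch and not a detail taken from the paper.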