Abstract: In many documents the text lines are not parallel to one another, i.e., the lines have different inclinations with respect to the horizontal (multi-skew documents). For OCR of such a document, the skew angle of each individual text line must be estimated, because a single rotation cannot de-skew all the text lines. In this paper, we describe a robust technique for multi-skew angle detection in documents written in the Indian scripts Devnagari and Bangla. Most characters in these scripts have a horizontal line at the top, called the headline. The headlines of the characters in a word usually connect to one another, so the word appears as a single connected component. In the proposed method, the connected components are first labeled and selected. The upper envelope of each selected component is found by column-wise scanning from the top of the component. Portions of the upper envelope that satisfy the properties of a digital straight line are detected and then clustered into groups, each belonging to a single text line. Estimates from the individual clusters give the skew angle of each text line. The proposed multi-skew detection technique achieves an accuracy of about 98.3%.
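The upper-envelope and angle-estimation steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `upper_envelope` and `skew_angle_degrees` are hypothetical, the input is assumed to be a binary mask of one already-labeled connected component, and an ordinary least-squares line fit is swapped in for the digital-straight-line test described in the abstract. Component selection and the clustering of segments into text lines are omitted.

```python
import math

import numpy as np


def upper_envelope(component):
    """Scan each column from the top and return the row of the first
    foreground pixel, or -1 for columns with no foreground pixel."""
    mask = np.asarray(component, dtype=bool)
    first_row = mask.argmax(axis=0)   # first True per column (0 if none)
    has_pixel = mask.any(axis=0)      # distinguishes empty columns
    return np.where(has_pixel, first_row, -1)


def skew_angle_degrees(envelope):
    """Estimate the skew angle from an upper envelope.

    Assumption: a least-squares line fit stands in for the
    digital-straight-line detection used in the paper.
    """
    envelope = np.asarray(envelope)
    cols = np.flatnonzero(envelope >= 0)
    if cols.size < 2:
        raise ValueError("need at least two envelope points")
    slope, _intercept = np.polyfit(cols, envelope[cols], 1)
    # Image rows grow downward, so negate the slope to make a line
    # that rises to the right yield a positive angle.
    return math.degrees(math.atan(-slope))


if __name__ == "__main__":
    # Synthetic component: a 3-pixel-thick headline that rises by one
    # row every four columns (purely illustrative data).
    mask = np.zeros((20, 40), dtype=bool)
    for c in range(40):
        top = 15 - c // 4
        mask[top:top + 3, c] = True
    print(skew_angle_degrees(upper_envelope(mask)))
```

In a fuller pipeline, one angle estimate per word component would then be grouped by text line, so that each line receives its own skew angle rather than a single rotation being applied to the whole page.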