This paper presents a method for determining the up/down orientation of text in a scanned document of unknown orientation, so that the document can be rotated appropriately and processed by an optical character recognition (OCR) engine. The method analyzes the "open" portions of text blobs to determine the direction in which those portions face. By comparing the densities of blobs that open in each of a pair of opposite directions (e.g., right and left), the method can establish the direction in which the text as a whole is oriented. We first describe a method for determining the up/down orientation of Roman text based on the asymmetry in the openness of most Roman letters in the horizontal direction. For non-Roman text such as Pashto and Hebrew, we provide a method that, given a training set of documents of known orientation, determines the direction that is most asymmetric and therefore most useful for determining text orientation. This work can be adapted for use in automated mail processing, or to determine the orientation of checks in automated teller machine envelopes, scanned or copied documents, documents sent via facsimile, and digital photographs that include text (e.g., road signs, business cards, driver's licenses), among other applications.
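The left/right counting idea for Roman text can be illustrated with a minimal Python sketch. It is not the paper's implementation: the per-row gap measure of "openness", the function names `opening_direction` and `estimate_orientation`, and the `upright_dominant` parameter (the direction assumed to dominate in upright text, which would in practice be learned from documents of known orientation) are all assumptions made for illustration. Blobs are assumed to be already segmented and cropped to their bounding boxes.

```python
import numpy as np

def opening_direction(blob):
    """Classify a blob (2-D boolean array, True = ink) as opening left or right.

    Assumed openness measure: for each row containing ink, count the background
    pixels between the bounding-box edge and the nearest ink on that side.
    """
    blob = np.asarray(blob, dtype=bool)
    width = blob.shape[1]
    left_gap = 0
    right_gap = 0
    for row in blob:
        cols = np.flatnonzero(row)
        if cols.size == 0:
            continue  # row without ink carries no information
        left_gap += int(cols[0])
        right_gap += int(width - 1 - cols[-1])
    if left_gap == right_gap:
        return None  # no horizontal asymmetry under this measure
    return "right" if right_gap > left_gap else "left"

def estimate_orientation(blobs, upright_dominant="right"):
    """Vote over blobs; a 180-degree rotation swaps left and right openings.

    `upright_dominant` is a hypothetical parameter: the opening direction
    assumed to be most frequent when the text is upright.
    """
    counts = {"left": 0, "right": 0}
    for blob in blobs:
        direction = opening_direction(blob)
        if direction is not None:
            counts[direction] += 1
    if counts["left"] == counts["right"]:
        return "undetermined", counts
    dominant = max(counts, key=counts.get)
    label = "upright" if dominant == upright_dominant else "upside_down"
    return label, counts

# Toy blob shaped like the letter "c", which opens to the right.
C_SHAPE = np.array([
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 1],
], dtype=bool)

page = [C_SHAPE, C_SHAPE, C_SHAPE]
flipped_page = [np.rot90(b, 2) for b in page]  # same page rotated 180 degrees

print(estimate_orientation(page))
print(estimate_orientation(flipped_page))
```

A page flagged as upside down would be rotated by 180 degrees before being passed to the OCR engine. For scripts without a strong horizontal asymmetry, the same voting could be run along whichever direction a training set shows to be most asymmetric.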