Algorithm for text page up/down orientation determination
Pattern Recognition Letters
This paper presents a method for determining the up/down orientation of text in a scanned document whose orientation is unknown. The method analyzes the "open" portions of text blobs to determine the direction in which each opening faces. By comparing the densities of blobs that open in each of two opposite directions (e.g., right and left), the method establishes the orientation of the text as a whole. We first discuss the orientation of Roman text, which relies on the asymmetry in the openness of Roman letters in the horizontal direction. For non-Roman text such as Pashto and Hebrew, we use a training dataset to find the direction that is most asymmetric, and therefore most useful, and then use that direction to orient the text. Applications include automated orientation of mail, checks in ATM envelopes, and scanned, copied, or faxed documents.
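The counting idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the openness measure (per-row background gaps between a blob's bounding box and its ink), the `margin` threshold, and all function names are hypothetical choices made for illustration. The `upright_majority` parameter encodes an assumption that upright Roman text has more right-opening than left-opening blobs; for other scripts it would have to be learned from training data, as the abstract describes.

```python
from typing import Dict, Sequence

# A blob is a binary image cropped to its bounding box: rows of 0 (background) / 1 (ink).
Blob = Sequence[Sequence[int]]


def side_gaps(blob: Blob) -> tuple:
    """Sum, over all inked rows, the background pixels between each
    bounding-box edge and the nearest ink pixel. Returns (left, right)."""
    left = right = 0
    for row in blob:
        ink = [x for x, v in enumerate(row) if v]
        if not ink:
            continue  # rows without ink carry no openness information
        left += ink[0]
        right += len(row) - 1 - ink[-1]
    return left, right


def opening_direction(blob: Blob, margin: float = 1.5) -> str:
    """Label a blob 'left', 'right', or 'none'. The margin is an assumed
    threshold that keeps near-symmetric blobs out of the vote."""
    left, right = side_gaps(blob)
    if right > 0 and right > margin * left:
        return "right"
    if left > 0 and left > margin * right:
        return "left"
    return "none"


def rotate180(blob: Blob) -> list:
    """Rotate a blob by 180 degrees (reverse the rows and each row)."""
    return [list(reversed(row)) for row in reversed(blob)]


def estimate_orientation(blobs: Sequence[Blob], upright_majority: str = "right") -> str:
    """Vote over all blobs. upright_majority is an assumption about which
    opening direction dominates in correctly oriented text."""
    counts: Dict[str, int] = {"left": 0, "right": 0, "none": 0}
    for blob in blobs:
        counts[opening_direction(blob)] += 1
    if counts["left"] == counts["right"]:
        return "undetermined"
    majority = "right" if counts["right"] > counts["left"] else "left"
    return "upright" if majority == upright_majority else "upside-down"


# Toy blobs: a 'c'-like shape that opens to the right, and a symmetric 'o'.
C = [[0, 1, 1, 1],
     [1, 0, 0, 0],
     [1, 0, 0, 0],
     [0, 1, 1, 1]]
O = [[0, 1, 1, 0],
     [1, 0, 0, 1],
     [1, 0, 0, 1],
     [0, 1, 1, 0]]

page = [C, C, O]
flipped_page = [rotate180(b) for b in page]
```

Calling `estimate_orientation(page)` and `estimate_orientation(flipped_page)` shows the vote flipping when the page is rotated by 180 degrees. A practical system would first extract blobs with connected-component labeling and would need a more robust openness measure than this row-gap heuristic.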