A major problem in form reading applications is that form fields cannot be located exactly because of nonlinear distortions in the form images. Such distortions appear, for example, on photocopied forms or on forms transmitted by fax. One way to solve this problem is to determine the form fields from the positions of the form lines. This paper describes a new method for finding pairs of corresponding form lines on a reference form and a filled form. The advantage of this method is that the corresponding line pairs can be used to map any pixel between the filled form and the reference form without any assumption about the kind of distortion. The core of the method is an algorithm based on A* search. Two sets of horizontal or vertical lines, one from the reference form and one from the filled form, are searched for pairs of corresponding lines. The algorithm's run time is low, and nonlinear distortions of the form images hardly influence its results. With increasing complexity (i.e., an increasing number of lines or decreasing image quality), the number of rejected form lines grows, but the error rate stays low.
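
The abstract does not give the cost function or search states of the algorithm, so the following is only an illustrative sketch of how an A*-based line matching could look, not the authors' method. It assumes each line is reduced to a single sorted coordinate (for example, the y-position of a horizontal line), that a match is penalized by how much the gap to the previously matched pair differs between the two forms, and that an unmatched (rejected) line costs a fixed `skip_cost`. The names `match_lines`, `map_coordinate`, and `skip_cost`, as well as the piecewise-linear mapping between matched lines, are hypothetical choices made for this example.

```python
import heapq


def match_lines(ref, fil, skip_cost=20.0):
    """A* search for an order-preserving pairing of two sorted position lists.

    Returns (pairs, cost), where pairs holds (ref_index, fil_index) tuples.
    The cost model is an assumption of this sketch, not taken from the paper.
    """
    m, n = len(ref), len(fil)

    def h(i, j):
        # Admissible lower bound: any surplus lines on one side must be skipped.
        return skip_cost * abs((m - i) - (n - j))

    # Heap entries: (g + h, g, i, j, pairs matched so far).
    heap = [(h(0, 0), 0.0, 0, 0, ())]
    best = {}
    while heap:
        _, g, i, j, pairs = heapq.heappop(heap)
        if i == m and j == n:
            return list(pairs), g
        last = pairs[-1] if pairs else None
        key = (i, j, last)
        if best.get(key, float("inf")) <= g:
            continue  # this state was already expanded at equal or lower cost
        best[key] = g
        if i < m and j < n:
            if last is None:
                step = 0.0  # first match: a global offset is not penalized
            else:
                pi, pj = last
                # Penalize the change in spacing relative to the last match,
                # which tolerates slowly varying (nonlinear) distortion.
                step = abs((ref[i] - ref[pi]) - (fil[j] - fil[pj]))
            heapq.heappush(
                heap,
                (g + step + h(i + 1, j + 1), g + step, i + 1, j + 1,
                 pairs + ((i, j),)),
            )
        if i < m:  # reject reference line i
            heapq.heappush(
                heap, (g + skip_cost + h(i + 1, j), g + skip_cost, i + 1, j, pairs)
            )
        if j < n:  # reject filled-form line j
            heapq.heappush(
                heap, (g + skip_cost + h(i, j + 1), g + skip_cost, i, j + 1, pairs)
            )
    return [], float("inf")


def map_coordinate(y, pairs, ref, fil):
    """Map a filled-form coordinate to the reference form.

    Interpolates linearly between neighboring matched lines and applies a
    constant shift outside the matched range (an assumption of this sketch).
    """
    if not pairs:
        return float(y)
    pts = [(fil[j], ref[i]) for i, j in pairs]
    if y <= pts[0][0]:
        return float(y + (pts[0][1] - pts[0][0]))
    if y >= pts[-1][0]:
        return float(y + (pts[-1][1] - pts[-1][0]))
    for (f0, r0), (f1, r1) in zip(pts, pts[1:]):
        if f0 <= y <= f1:
            t = (y - f0) / (f1 - f0)
            return r0 + t * (r1 - r0)


# Hypothetical positions: the filled form has one spurious line at 150.
ref_lines = [100, 200, 300]
fil_lines = [103, 150, 207, 312]
pairs, cost = match_lines(ref_lines, fil_lines)
print(pairs, cost)
print(map_coordinate(155, pairs, ref_lines, fil_lines))
```

In this toy input the spurious line is rejected rather than forced into a pair, which mirrors the behavior described above: harder inputs lead to more rejected lines instead of more wrong matches. Because only matched line positions drive the mapping, no global distortion model (affine, projective, and so on) has to be assumed.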