Boosting-based Transductive Learning for Text Detection
ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition
We describe an image analysis system for handling complex and noisy images of forms and bank documents, such as business checks, personal checks, or bank deposits. Some of these document types have no standardized layout, so the whole image must be analyzed carefully to find where the relevant information, for example the courtesy amount, is located. Each element in the image is first classified as machine-printed text, handwritten text, or a graphical element such as a line. To identify these elements reliably under noisy conditions, a set of templates is scanned over the image, extracting features such as strokes, line end stops, and corners. This representation allows a quick and robust analysis of the image's content to identify the different parts. Once a set of candidate subimages has been found, they are sent to a field recognition system. We describe an example of one such system, which locates and reads courtesy amounts on US checks.
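The template-scanning step can be illustrated with a minimal sketch. This is not the system's implementation: the 3x3 templates, the names `TEMPLATES`, `matches`, `scan`, and `parse`, and the exact-match rule are all assumptions made for illustration. A system built for noisy images would presumably tolerate some mismatched pixels and use larger templates.

```python
# Hypothetical 3x3 templates (not from the paper): 1 = ink, 0 = background.
TEMPLATES = {
    "horizontal_stroke": [
        [0, 0, 0],
        [1, 1, 1],
        [0, 0, 0],
    ],
    "right_end_stop": [
        [0, 0, 0],
        [1, 1, 0],
        [0, 0, 0],
    ],
    "top_left_corner": [
        [0, 0, 0],
        [0, 1, 1],
        [0, 1, 0],
    ],
}


def matches(image, template, top, left):
    """Return True if every template pixel equals the image pixel beneath it."""
    for dr, row in enumerate(template):
        for dc, want in enumerate(row):
            if image[top + dr][left + dc] != want:
                return False
    return True


def scan(image, templates=TEMPLATES):
    """Slide each template over the image and collect (name, row, col) hits.

    The reported row and column are the centre of the matched window.
    """
    hits = []
    height, width = len(image), len(image[0])
    for name, template in templates.items():
        t_h, t_w = len(template), len(template[0])
        for top in range(height - t_h + 1):
            for left in range(width - t_w + 1):
                if matches(image, template, top, left):
                    hits.append((name, top + t_h // 2, left + t_w // 2))
    return hits


def parse(rows):
    """Convert rows of '#' (ink) and '.' (background) into a 0/1 grid."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


# A toy binary image: a horizontal stroke joined to a vertical one at a corner.
IMAGE = parse([
    ".......",
    ".####..",
    ".#.....",
    ".#.....",
    ".......",
])

for name, row, col in scan(IMAGE):
    print(name, row, col)
```

On this toy image the scan reports one horizontal stroke segment, one right end stop, and one top-left corner. Exact matching keeps the sketch short; under noise, a mismatch tolerance or a score threshold per template would be a natural extension.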