Document identification using shape trees

Authors:
Uwe Henker; Uwe Petersohn
Affiliations:
(Corresponding e-mail: u.henker@docexpert.de) DOCexpert Computer GmbH, Kirschhäckerstraße 27, D-96052 Bamberg, Germany; Department of Computer Science, Dresden University of Technology, D-01062 Dresden, Germany
Venue:
International Journal of Hybrid Intelligent Systems
Year:
2009



Abstract

We present a technique for identifying documents from an abstraction of a scanned image. The document types to be identified must be known to the system a priori. To this end, the necessary features are stored as shape trees in a case base [8], which also contains rules for possible further processing. From a drastically reduced image, the significant, distinguishing information can be filtered out and recognized using case-based reasoning (CBR) [8]. The method was evaluated in experiments with medical order forms: on average, 97% of the forms were identified correctly, and none were identified incorrectly.
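
The retrieval step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the node representation (page-relative bounding boxes), the similarity measure, the threshold value, and all names (`ShapeNode`, `tree_similarity`, `identify`) are assumptions made for illustration only. The sketch shows how a case base of known form types might be queried, and how a similarity threshold lets the system reject an unknown document rather than identify it incorrectly.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ShapeNode:
    # Hypothetical node: a bounding box in page-relative coordinates (0..1)
    # with nested child shapes. The paper's actual features may differ.
    x: float
    y: float
    w: float
    h: float
    children: List["ShapeNode"] = field(default_factory=list)


def box_similarity(a: ShapeNode, b: ShapeNode) -> float:
    # 1.0 for identical boxes, decreasing with the mean absolute difference.
    diff = (abs(a.x - b.x) + abs(a.y - b.y) + abs(a.w - b.w) + abs(a.h - b.h)) / 4
    return max(0.0, 1.0 - diff)


def tree_similarity(a: ShapeNode, b: ShapeNode) -> float:
    # Assumed measure: average of the nodes' own box similarity and the mean
    # similarity of children paired in order; unmatched children score 0.
    own = box_similarity(a, b)
    if not a.children and not b.children:
        return own
    n = max(len(a.children), len(b.children))
    paired = sum(tree_similarity(ca, cb) for ca, cb in zip(a.children, b.children))
    return 0.5 * own + 0.5 * paired / n


def identify(query: ShapeNode, case_base: Dict[str, ShapeNode],
             threshold: float = 0.9) -> Optional[str]:
    # Retrieve the most similar stored case; return None (reject) when no
    # case reaches the assumed threshold.
    best_name, best_score = None, 0.0
    for name, tree in case_base.items():
        score = tree_similarity(query, tree)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Illustrative case base with two made-up form layouts.
case_base = {
    "form_a": ShapeNode(0, 0, 1, 1, [ShapeNode(0.1, 0.1, 0.8, 0.1),
                                     ShapeNode(0.1, 0.3, 0.3, 0.5)]),
    "form_b": ShapeNode(0, 0, 1, 1, [ShapeNode(0.5, 0.1, 0.4, 0.3),
                                     ShapeNode(0.05, 0.6, 0.9, 0.3)]),
}

# A scan that differs slightly from form_a, and a layout unknown to the system.
scanned = ShapeNode(0, 0, 1, 1, [ShapeNode(0.11, 0.1, 0.8, 0.1),
                                 ShapeNode(0.1, 0.31, 0.3, 0.5)])
unknown = ShapeNode(0, 0, 1, 1, [ShapeNode(0.4, 0.4, 0.2, 0.2)])
```

Calling `identify(scanned, case_base)` selects the closest stored case, while `identify(unknown, case_base)` returns `None` because no case is similar enough. Choosing a strict threshold trades a lower identification rate for fewer false identifications.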