Document identification using shape trees

  • Authors:
  • Uwe Henker; Uwe Petersohn

  • Affiliations:
  • (Corresponding e-mail: u.henker@docexpert.de) DOCexpert Computer GmbH, Kirschhäckerstraße 27, D-96052 Bamberg, Germany; Department of Computer Science, Dresden University of Technology, D-01062 Dresden, Germany

  • Venue:
  • International Journal of Hybrid Intelligent Systems
  • Year:
  • 2009

Abstract

We present a technique for identifying documents from an abstraction of their scanned image. The document types to be identified must be known to the system a priori. To this end, the necessary features of each type are stored as shape trees in a case base [8], which also contains rules for possible further processing. Even in an extremely reduced image, the significant, distinguishing information can be filtered out and recognized using case-based reasoning (CBR) [8]. The method was evaluated in experiments with medical order forms: on average, 97% of the forms were identified correctly, and none were identified incorrectly.
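
The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how shape trees are built or compared, so the node layout (nested bounding boxes in normalized coordinates), the similarity measure (intersection over union combined with greedy child matching), the rejection threshold, and all names (ShapeNode, tree_similarity, identify) are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ShapeNode:
    """One region of the reduced image (hypothetical layout).

    Coordinates are assumed to be normalized to the range [0, 1].
    """
    x: float
    y: float
    w: float
    h: float
    children: list["ShapeNode"] = field(default_factory=list)


def box_similarity(a: ShapeNode, b: ShapeNode) -> float:
    """Intersection over union of the two bounding boxes."""
    ix = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    inter = ix * iy
    union = a.w * a.h + b.w * b.h - inter
    return inter / union if union > 0 else 0.0


def tree_similarity(a: ShapeNode, b: ShapeNode) -> float:
    """Combine the similarity of two nodes with that of their children.

    Children are paired greedily; unmatched children lower the score.
    This measure is an assumption, not taken from the paper.
    """
    own = box_similarity(a, b)
    if not a.children and not b.children:
        return own
    remaining = list(b.children)
    total = 0.0
    for child in a.children:
        if not remaining:
            break
        best = max(remaining, key=lambda c: tree_similarity(child, c))
        total += tree_similarity(child, best)
        remaining.remove(best)
    slots = max(len(a.children), len(b.children))
    return 0.5 * own + 0.5 * (total / slots)


def identify(query: ShapeNode,
             case_base: dict[str, ShapeNode],
             threshold: float = 0.8) -> Optional[str]:
    """Return the best-matching document type, or None if no stored
    case reaches the (assumed) similarity threshold."""
    best_name, best_score = None, 0.0
    for name, tree in case_base.items():
        score = tree_similarity(query, tree)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None
```

In this sketch, a query that matches no stored case closely enough is left unidentified rather than assigned to the nearest type. Whether the published system handles uncertain matches in this way is not stated in the abstract.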