Segmentation and Validation of Commercial Documents Logical Structure

  • Authors:
  • Miguel Diogenes Matrakas;Flávio Bortolozzi

  • Venue:
  • ITCC '00 Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC'00)
  • Year:
  • 2000

Abstract

This work presents an approach for extracting and validating the logical structure of the images that make up a commercial document. The Run Length Smoothing Algorithm (RLSA) is used to segment the image of a commercial document of type letter, official letter, or memo, and the nearest neighbor rule is used to label the resulting elements from a vector of 28 features. The most common classes considered are date, logotype, text body, signature, addressee, invocation, and greeting. A database of 283 commercial document images in 256 gray levels was created and used for element classification. The study obtained good results for the classification of elements in commercial documents.
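
As an illustration of the two techniques named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes a binarized image stored as lists of 0 (background) and 1 (ink), fills only background runs bounded by ink on both sides, combines the horizontal and vertical passes with a logical AND, and uses Euclidean distance for the nearest neighbor rule. The function names, thresholds, and the short feature vectors are hypothetical; the paper uses 28 features whose definitions are not given in the abstract.

```python
import math


def smooth_row(row, threshold):
    """Fill background runs (0s) of length <= threshold lying between two ink pixels (1s)."""
    out = list(row)
    last_ink = None  # index of the most recent ink pixel seen so far
    for i, pixel in enumerate(row):
        if pixel == 1:
            gap = i - last_ink - 1 if last_ink is not None else 0
            if 0 < gap <= threshold:
                for j in range(last_ink + 1, i):
                    out[j] = 1
            last_ink = i
    return out


def rlsa(image, h_threshold, v_threshold):
    """Smooth rows and columns separately, then AND the two results (assumed combination)."""
    horizontal = [smooth_row(r, h_threshold) for r in image]
    # Transpose, smooth each column as if it were a row, then transpose back.
    smoothed_columns = [smooth_row(list(c), v_threshold) for c in zip(*image)]
    vertical = [list(r) for r in zip(*smoothed_columns)]
    return [[h & v for h, v in zip(hr, vr)] for hr, vr in zip(horizontal, vertical)]


def nearest_neighbor_label(sample, training_set):
    """Return the label of the training vector closest to sample (Euclidean distance)."""
    best_label, best_distance = None, math.inf
    for features, label in training_set:
        distance = math.dist(sample, features)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label


# Hypothetical two-feature training set; real vectors would have 28 entries.
training = [((0.0, 0.0), "date"), ((10.0, 10.0), "signature")]
print(smooth_row([1, 0, 0, 1, 0, 0, 0, 0, 1], 2))
print(nearest_neighbor_label((1.0, 2.0), training))
```

In this sketch, the connected blocks left after smoothing would be the candidate elements, and each block's feature vector would be passed to `nearest_neighbor_label` to assign a class such as date or signature.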