Correction of medical handwriting OCR based on semantic similarity

  • Authors:
  • Bartosz Broda; Maciej Piasecki

  • Affiliations:
  • Institute of Applied Informatics, Wrocław University of Technology, Poland (both authors)

  • Venue:
  • IDEAL'07: Proceedings of the 8th International Conference on Intelligent Data Engineering and Automated Learning
  • Year:
  • 2007

Abstract

This paper presents a method for correcting the output of handwriting Optical Character Recognition (OCR) based on semantic similarity. Several ways of extracting semantic similarity measures from a corpus are analysed; the best results were achieved by combining a text-window context with the Rank Weight Function. An algorithm is proposed for selecting a word sequence with high internal similarity. The method was trained on and applied to a corpus of real medical documents written in Polish.
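The general idea of picking, from competing OCR candidates, the word sequence whose members are most similar to each other can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it assumes raw co-occurrence counts in a text window with cosine similarity (the Rank Weight Function is not implemented here), an exhaustive search over candidate combinations, and a toy English corpus. The names `cooccurrence_vectors`, `cosine`, and `best_sequence` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each word to counts of the words seen within +/- `window` positions."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def best_sequence(candidates, vectors):
    """Pick one candidate per position so that summed pairwise similarity is maximal.

    Exhaustive search: an assumption made for brevity, feasible only for
    short sequences with few candidates per position.
    """
    empty = Counter()

    def score(seq):
        return sum(
            cosine(vectors.get(a, empty), vectors.get(b, empty))
            for i, a in enumerate(seq)
            for b in seq[i + 1:]
        )

    return max(product(*candidates), key=score)


# Toy corpus and OCR candidate lists (illustrative data, not from the paper).
corpus = [
    ["patient", "has", "fever"],
    ["patient", "has", "cough"],
    ["fever", "and", "cough"],
    ["river", "bank", "water"],
]
vectors = cooccurrence_vectors(corpus, window=2)
candidates = [["patient", "river"], ["fever", "water"]]
print(best_sequence(candidates, vectors))
```

In a realistic setting the candidate lists would come from the OCR engine's ranked hypotheses for each word position, and the exhaustive search would likely need to be replaced by something cheaper, since the number of combinations grows exponentially with sequence length.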