A Lexicon Reduction Strategy in the Context of Handwritten Medical Forms

  • Authors:
  • Robert Milewski; Srirangaraj Setlur; Venu Govindaraju

  • Affiliations:
  • University at Buffalo, State University of New York; University at Buffalo, State University of New York; University at Buffalo, State University of New York

  • Venue:
  • ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition
  • Year:
  • 2005

Abstract

Traditional handwriting recognition algorithms rely heavily on small lexicons and clean word images. Unfortunately, emergency medical documents satisfy neither condition. This is a significant roadblock hampering efforts to rapidly convert valuable offline healthcare handwriting data into digital content that can be efficiently mined for information. This paper describes a strategy that, given an image of a noisy handwritten word from a medical document and a large lexicon of English, medical, and pharmacological words, symbols, abbreviations, and acronyms, significantly reduces the size of the lexicon while retaining the entry that matches the unknown word. The approach combines geometric interpretations of the word image with contextual inference of concepts to reduce lexicons for word recognition. The extracted data can then be efficiently and securely disseminated for epidemiological and outbreak detection and analysis. Experimental results on NY State PCR forms are reported.
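
The general idea of lexicon reduction described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify which geometric features or which contextual model are used. The sketch assumes a hypothetical upstream step that estimates, from the word image, a plausible character-count range and whether ascender or descender strokes are present, and it assumes each lexicon entry is tagged with hypothetical concept categories. All names (`WordImageFeatures`, `shape_matches`, `reduce_lexicon`) and the toy lexicon are illustrative only.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

# Letters with strokes above the x-height or below the baseline (lowercase only).
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


@dataclass(frozen=True)
class WordImageFeatures:
    """Hypothetical coarse geometric cues estimated from a word image."""
    min_chars: int
    max_chars: int
    has_ascender: bool
    has_descender: bool


def shape_matches(entry: str, feats: WordImageFeatures) -> bool:
    """Return True if a lexicon entry is consistent with the geometric cues."""
    word = entry.lower()
    if not (feats.min_chars <= len(word) <= feats.max_chars):
        return False
    if any(c in ASCENDERS for c in word) != feats.has_ascender:
        return False
    if any(c in DESCENDERS for c in word) != feats.has_descender:
        return False
    return True


def reduce_lexicon(
    lexicon: Dict[str, Set[str]],
    feats: WordImageFeatures,
    context: Set[str],
) -> List[str]:
    """Keep entries that fit the word shape and share a concept with the context."""
    return sorted(
        word
        for word, concepts in lexicon.items()
        if shape_matches(word, feats) and concepts & context
    )


# Toy lexicon: entry -> assumed concept categories.
toy_lexicon = {
    "asthma": {"respiratory"},
    "angina": {"cardiac"},
    "pain": {"general", "cardiac"},
    "chest": {"cardiac"},
    "stroke": {"neuro"},
}
features = WordImageFeatures(min_chars=4, max_chars=6,
                             has_ascender=True, has_descender=False)
reduced = reduce_lexicon(toy_lexicon, features, context={"cardiac"})
print(reduced)
```

In this sketch the filters are strict, so a wrong feature estimate would discard the correct entry. Any practical system working with noisy images would presumably need softer constraints to keep the desired entry in the reduced lexicon, as the abstract requires.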