The use of context for correcting garbled English text

  • Authors:
  • Charles M. Vossler; Neil M. Branston

  • Venue:
  • ACM '64 Proceedings of the 1964 19th ACM national conference
  • Year:
  • 1964

Abstract

The paper describes two different methods for using context to correct garbled English text. The first makes use of a dictionary of English words containing their probability of occurrence. The second uses letter digram frequencies to roughly approximate English word probabilities. Probabilities of various letter substitutions are obtained from a confusion matrix of the simulated character recognizer whose operation produced the garbling. This information is combined using a maximum likelihood scheme to obtain word recognition or, if only digram information is available, the recognition of word approximations. In order to test the methods empirically, computer programs were written, and experiments were run using textual material from various sources. Besides a rather limited comparison of the “dictionary” and “digram” methods on material from a children's primer, a test was made of a combined system on material from newspaper articles and from a book on psychology.
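
The two methods summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' program: the vocabulary, word probabilities, digram probabilities, and confusion matrix are invented toy values, and the function names (channel_log_prob, correct_with_dictionary, correct_with_digrams) are hypothetical. The sketch assumes the recognizer only substitutes letters (no insertions or deletions), and the digram variant uses a brute-force search over all letter strings, which is practical only for short words.

```python
import itertools
import math

# Toy word probabilities P(word); invented for illustration only.
WORD_PROB = {"cat": 0.5, "cot": 0.2, "car": 0.3}

# Toy confusion matrix: CONFUSION[true][seen] = P(recognizer prints `seen` | true letter).
CONFUSION = {
    "c": {"c": 0.9, "e": 0.1},
    "a": {"a": 0.7, "o": 0.3},
    "o": {"o": 0.8, "a": 0.2},
    "t": {"t": 0.8, "r": 0.2},
    "r": {"r": 0.9, "t": 0.1},
}

# Toy digram probabilities P(next | previous); "^" marks the start of a word.
DIGRAM_PROB = {
    ("^", "c"): 0.6, ("c", "a"): 0.5, ("c", "o"): 0.3,
    ("a", "t"): 0.4, ("a", "r"): 0.4, ("o", "t"): 0.5, ("o", "r"): 0.1,
}
UNSEEN_DIGRAM = 1e-3  # assumed floor for digrams missing from the table


def channel_log_prob(true_letters, observed):
    """Log P(observed | true_letters), assuming independent letter substitutions."""
    total = 0.0
    for true_ch, seen_ch in zip(true_letters, observed):
        p = CONFUSION.get(true_ch, {}).get(seen_ch, 0.0)
        if p == 0.0:
            return -math.inf
        total += math.log(p)
    return total


def correct_with_dictionary(observed):
    """Pick the dictionary word maximizing P(word) * P(observed | word)."""
    best_word, best_score = None, -math.inf
    for word, prior in WORD_PROB.items():
        if len(word) != len(observed):
            continue  # substitution-only model: lengths must match
        score = math.log(prior) + channel_log_prob(word, observed)
        if score > best_score:
            best_word, best_score = word, score
    return best_word if best_word is not None else observed


def correct_with_digrams(observed):
    """Pick the letter string maximizing digram prior * P(observed | string)."""
    alphabet = sorted(CONFUSION)
    best, best_score = None, -math.inf
    for cand in itertools.product(alphabet, repeat=len(observed)):
        channel = channel_log_prob(cand, observed)
        if channel == -math.inf:
            continue
        prior, prev = 0.0, "^"
        for ch in cand:
            prior += math.log(DIGRAM_PROB.get((prev, ch), UNSEEN_DIGRAM))
            prev = ch
        if prior + channel > best_score:
            best, best_score = cand, prior + channel
    return "".join(best) if best is not None else observed


if __name__ == "__main__":
    print(correct_with_dictionary("cor"))
    print(correct_with_digrams("cor"))
```

With these toy tables, both functions map the garbled string "cor" to "car": the substitution probabilities favor "r" as the true last letter, and the word (or digram) prior then favors "a" over "o" in the middle position. The digram variant can also return strings that are not English words, which corresponds to the "word approximations" mentioned in the abstract.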