MOrpho-LEXical Analysis for Correcting OCR-Generated Arabic Words (MOLEX)

  • Authors:
  • Toufik Sari;Mokhtar Sellami

  • Affiliations:
  • -;-

  • Venue:
  • IWFHR '02 Proceedings of the Eighth International Workshop on Frontiers in Handwriting Recognition (IWFHR'02)
  • Year:
  • 2002

Abstract

In this paper we present a context-based method for correcting Arabic words generated by OCR systems. The technique operates as a post-processor and is intended to be universal. It corrects substitution and rejection errors. The properties of the Arabic language are very useful in morpho-lexical analysis and are therefore heavily exploited in the method. Substitution errors, the errors most frequently committed by OCR systems, are written as production rules for use by a rule-based system that corrects Arabic words. The first version of the method operates only at the morpho-lexical level; extension to the other levels of language analysis is left as future work.
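
The abstract gives no implementation details, so the following is only a minimal sketch of how substitution errors might be encoded as rules and checked against a lexicon. It is not the authors' system. The rule table, the toy lexicon, the `?` rejection marker, and the function names `candidates` and `correct` are all assumptions made for illustration. The sketch also repairs only one character per word and uses plain lexicon lookup in place of a real morpho-lexical analysis.

```python
# Hypothetical substitution rules: observed letter -> letters it may stand for.
# These pairs are illustrative (letters that differ mainly by dots), not taken from the paper.
SUBSTITUTION_RULES = {
    "ن": ["ت", "ب"],
    "ت": ["ن", "ث"],
    "ف": ["ق"],
}

# Toy lexicon (assumption); a real system would rely on morphological analysis.
LEXICON = {"كتاب", "كاتب", "مكتب", "قلم"}

# Assumed marker for a character the OCR engine rejected.
REJECT = "?"

# Letters tried at a rejected position: every letter seen in the lexicon.
ALPHABET = sorted({ch for word in LEXICON for ch in word})


def candidates(word):
    """Yield every string obtained by repairing exactly one position of `word`."""
    for i, ch in enumerate(word):
        if ch == REJECT:
            replacements = ALPHABET
        else:
            replacements = SUBSTITUTION_RULES.get(ch, [])
        for r in replacements:
            yield word[:i] + r + word[i + 1:]


def correct(word):
    """Return [word] if it is in the lexicon, else the sorted valid single-repair candidates."""
    if word in LEXICON:
        return [word]
    return sorted({c for c in candidates(word) if c in LEXICON})
```

Under these assumptions, `correct("كناب")` applies the hypothetical rule mapping ن to ت and keeps only the candidate found in the lexicon, while `correct("ك?اب")` fills the rejected position by trying each letter of the toy alphabet. A word that no single repair can map into the lexicon yields an empty list.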