Arabic-document compression: A close look at group 3 international digital facsimile coding standards

  • Authors:
  • A. Kh. Al Jabri

  • Affiliations:
  • -

  • Venue:
  • Computer Standards & Interfaces
  • Year:
  • 1996

Abstract

Efficient bit representation, or compression, of documents is an important issue in many applications. The achievable compression depends on the document contents (written script, diagrams, tables, etc.), which also determine its limit. CCITT Recommendation T.4, 'Standardization of group 3 apparatus for document transmission', adopts a modified Huffman code as the standard compression technique [1]. That selection was based on examining documents with contents of different natures. Given the cursive nature of printed Arabic and the dominance of certain shapes in it, it is natural to ask how efficiently the chosen standard compresses documents with printed Arabic content. For this purpose, more than ten documents containing printed Arabic script are scanned and analyzed in this paper. Both the entropy, based on the Capon model [5], and the compression rates obtained with the modified Huffman code are calculated. Our results show that the CCITT coding standard seems to be robust for documents with printed Arabic script.
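
As a rough illustration of the kind of entropy measurement the abstract describes, the sketch below estimates a run-length entropy for a bilevel image. It is not the paper's procedure: it assumes that white and black runs are statistically independent (a common simplification when a scan line is modelled as a two-state first-order Markov source), and the function names `run_lengths`, `entropy` and `bits_per_pixel` are hypothetical. Pixels are assumed to be 0 for white and 1 for black.

```python
from collections import Counter
from itertools import groupby
from math import log2


def run_lengths(rows):
    """Collect white (0) and black (1) run lengths from rows of 0/1 pixels."""
    runs = {0: [], 1: []}
    for row in rows:
        # groupby yields maximal stretches of identical consecutive pixels
        for pixel, group in groupby(row):
            runs[pixel].append(sum(1 for _ in group))
    return runs[0], runs[1]


def entropy(lengths):
    """Empirical entropy, in bits per run, of a list of run lengths."""
    counts = Counter(lengths)
    total = len(lengths)
    return -sum(c / total * log2(c / total) for c in counts.values())


def bits_per_pixel(rows):
    """Estimated entropy per pixel, assuming independent alternating runs."""
    white, black = run_lengths(rows)
    if not white or not black:
        raise ValueError("need at least one white run and one black run")
    mean_white = sum(white) / len(white)
    mean_black = sum(black) / len(black)
    # One white run plus one black run covers mean_white + mean_black
    # pixels on average and costs H(white) + H(black) bits to describe.
    return (entropy(white) + entropy(black)) / (mean_white + mean_black)
```

Under these assumptions, the reciprocal of `bits_per_pixel` gives an upper bound on the compression ratio that any code assigning one codeword per run, such as a modified Huffman code, could reach on the same data.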