Storing text using integer codes

  • Authors:
  • Raja Noor Ainon

  • Affiliations:
  • University of Malaya, Kuala Lumpur, Malaysia

  • Venue:
  • COLING '86 Proceedings of the 11th conference on Computational linguistics
  • Year:
  • 1986

Abstract

Traditionally, text is stored on computers as a stream of characters. The goal of this research is to store text in a form that facilitates word manipulation whilst reducing storage space. A word list with syntactic linear ordering is stored, and each word in a text is given a two-byte integer code that points to its position in this list. The implementation of the encoding scheme is described and its performance statistics are presented.
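
The general idea of replacing each word with a two-byte code that indexes a stored word list can be illustrated with a minimal sketch. This is not the paper's implementation: the function names are hypothetical, plain alphabetical order stands in for the "syntactic linear ordering" (which the abstract does not define), and words are assumed to be separated by single spaces.

```python
import struct


def build_word_list(text):
    # Assumption: alphabetical order stands in for the paper's ordering,
    # which the abstract does not specify.
    return sorted(set(text.split()))


def encode(text, word_list):
    # Map each word to its position in the word list.
    index = {word: i for i, word in enumerate(word_list)}
    codes = [index[word] for word in text.split()]
    # "H" is an unsigned 16-bit integer, so at most 65,536 distinct
    # words can be addressed; ">" selects big-endian byte order.
    return struct.pack(f">{len(codes)}H", *codes)


def decode(data, word_list):
    # Each code occupies exactly two bytes.
    codes = struct.unpack(f">{len(data) // 2}H", data)
    return " ".join(word_list[code] for code in codes)


text = "the cat sat on the mat"
words = build_word_list(text)
packed = encode(text, words)
restored = decode(packed, words)
```

In this sketch the six words occupy 12 bytes once encoded, plus the cost of the shared word list. Punctuation, capitalisation and spacing would need separate handling that is not shown here.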