Models of English Text

  • Authors:
  • W. J. Teahan; John G. Cleary

  • Venue:
  • DCC '97 Proceedings of the Conference on Data Compression
  • Year:
  • 1997

Abstract

The problem of constructing models of English text is considered. A number of applications of such models, including cryptology, spelling correction, and speech recognition, are reviewed. The best current models for English text have come from research into compression. Compression is not only an important application of such models; the amount of compression achieved also provides a measure of how well they perform. Three main classes of models are considered: character-based models, word-based models, and models that use auxiliary information in the form of parts of speech. These models are compared in terms of their memory usage and compression performance.
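
The abstract's point that the amount of compression measures how well a model performs can be made concrete with a small sketch. The code below is not the authors' method. It assumes a simple adaptive character-based context model with add-one smoothing and reports the ideal code length in bits per character (lower means the model predicts the text better). The function name `bits_per_char`, the `order` parameter, and the fixed `alphabet_size` of 256 are illustrative assumptions.

```python
import math
from collections import defaultdict


def bits_per_char(text: str, order: int = 2, alphabet_size: int = 256) -> float:
    """Ideal code length, in bits per character, of an adaptive
    character-based model that conditions on the previous `order` characters.

    Assumption: add-one (Laplace) smoothing over a fixed alphabet size.
    This is an illustrative stand-in, not the model evaluated in the paper.
    """
    counts = defaultdict(lambda: defaultdict(int))  # context -> symbol -> count
    totals = defaultdict(int)                       # context -> symbols seen
    total_bits = 0.0

    for i, ch in enumerate(text):
        context = text[max(0, i - order):i]
        # Estimate the probability of `ch` before updating the counts,
        # as an adaptive compressor would.
        p = (counts[context][ch] + 1) / (totals[context] + alphabet_size)
        total_bits += -math.log2(p)
        counts[context][ch] += 1
        totals[context] += 1

    return total_bits / len(text) if text else 0.0
```

Under these assumptions, calling `bits_per_char(sample, order=0)` and `bits_per_char(sample, order=2)` on the same text gives a rough way to compare two character-based models: whichever yields fewer bits per character would compress that text better with an ideal arithmetic coder.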