Finite automata for compact representation of tuple dictionaries

  • Authors:
  • Jan Daciuk; Gertjan van Noord

  • Affiliations:
  • Alfa-Informatica, Rijksuniversiteit Groningen, Oude Kijk in 't Jatstraat 26, Postbus 716, 9700 AS Groningen, The Netherlands; Alfa-Informatica, Rijksuniversiteit Groningen, Oude Kijk in 't Jatstraat 26, Postbus 716, 9700 AS Groningen, The Netherlands

  • Venue:
  • Theoretical Computer Science - Implementation and application of automata
  • Year:
  • 2004

Abstract

A generalization of the dictionary data structure, called a tuple dictionary, is described. A tuple dictionary represents a mapping from n-tuples of strings to values. The data structure is motivated by practical applications in speech and language processing, in which very large tuple dictionaries are used to represent language models. A technique for the compact representation of tuple dictionaries is presented. The technique can be seen as an application and extension of perfect hashing by means of finite-state automata. Preliminary experiments indicate that the technique yields considerable space savings of up to 90%.
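
The abstract does not spell out the construction, so the following is only a minimal sketch of the general idea, under stated assumptions. It uses a plain trie rather than a minimized automaton, numbers each word by its rank in sorted order (the usual automaton-based perfect hashing scheme), and stores the tuples as sorted tuples of word numbers searched by bisection. The names `Node`, `build_trie`, `word_number`, and `TupleDictionary` are hypothetical and are not taken from the paper.

```python
from bisect import bisect_left


class Node:
    """One automaton state (here: a trie node, not minimized)."""

    def __init__(self):
        self.children = {}   # transition label -> Node
        self.final = False   # True if a word ends in this state
        self.count = 0       # number of words accepted starting from this state


def _count(node):
    # Fill in the size of the right language of every state, bottom-up.
    node.count = int(node.final) + sum(_count(c) for c in node.children.values())
    return node.count


def build_trie(words):
    root = Node()
    for word in set(words):
        node = root
        for ch in word:
            node = node.children.setdefault(ch, Node())
        node.final = True
    _count(root)
    return root


def word_number(root, word):
    """Return the 0-based rank of `word` among the stored words, or None."""
    node, index = root, 0
    for ch in word:
        if ch not in node.children:
            return None
        if node.final:
            index += 1  # a shorter word ends here and sorts before `word`
        # Skip every word reachable through a smaller transition label.
        index += sum(c.count for label, c in node.children.items() if label < ch)
        node = node.children[ch]
    return index if node.final else None


class TupleDictionary:
    """Maps n-tuples of strings to values via tuples of word numbers."""

    def __init__(self, entries):
        vocabulary = {w for key in entries for w in key}
        self.root = build_trie(vocabulary)
        packed = sorted(
            ((tuple(word_number(self.root, w) for w in key), value)
             for key, value in entries.items()),
            key=lambda kv: kv[0],
        )
        self.keys = [k for k, _ in packed]
        self.values = [v for _, v in packed]

    def get(self, key, default=None):
        ids = []
        for w in key:
            n = word_number(self.root, w)
            if n is None:          # unknown word: the tuple cannot be present
                return default
            ids.append(n)
        ids = tuple(ids)
        i = bisect_left(self.keys, ids)
        if i < len(self.keys) and self.keys[i] == ids:
            return self.values[i]
        return default


# Hypothetical bigram scores, used purely for illustration.
bigrams = TupleDictionary({
    ("the", "cat"): 0.5,
    ("the", "dog"): 0.25,
    ("a", "cat"): 0.125,
})
```

In this sketch each distinct word is stored once, in the automaton, and every tuple holds only small integers, which is where the space saving would come from when the same words recur in many tuples. Replacing the trie with a minimized automaton would shrink the word store further without changing the numbering logic.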