On the Determinization of Weighted Finite Automata

  • Authors:
  • Adam L. Buchsbaum;Raffaele Giancarlo;Jeffery R. Westbrook

  • Venue:
  • SIAM Journal on Computing
  • Year:
  • 2000

Abstract

We study the problem of constructing the deterministic equivalent of a nondeterministic weighted finite-state automaton (WFA). Determinization of WFAs has important applications in automatic speech recognition (ASR). We provide the first polynomial-time algorithm to test for the twins property, which determines if a WFA admits a deterministic equivalent. We also give upper bounds on the size of the deterministic equivalent; the bound is tight in the case of acyclic WFAs. Previously, Mohri presented a superpolynomial-time algorithm to test for the twins property, and he also gave an algorithm to determinize WFAs. He showed that the latter runs in time linear in the size of the output when a deterministic equivalent exists; otherwise, it does not terminate. Our bounds imply an upper bound on the running time of this algorithm.

Given that WFAs can expand exponentially in size when determinized, we explore why those that occur in ASR tend to shrink when determinized. According to ASR folklore, this phenomenon is attributable solely to the fact that ASR WFAs have simple topology, in particular, that they are acyclic and layered. We introduce a very simple class of WFAs with this structure, but we show that the expansion under determinization depends on the transition weights: some weightings cause them to shrink, while others, including random weightings, cause them to expand exponentially. We provide experimental evidence that ASR WFAs exhibit this weight dependence. That they shrink when determinized, therefore, is a result of favorable weightings in addition to special topology. These analyses and observations have been used to design a new, approximate WFA determinization algorithm, reported in a separate paper along with experimental results showing that it achieves significant WFA size reduction with negligible impact on ASR performance.
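
To make concrete what determinizing a WFA involves, the following is a minimal sketch of weighted subset construction over the tropical (min, +) semiring, in the spirit of the determinization procedure the abstract attributes to Mohri. It is not taken from the paper: the toy automaton, the tuple encoding of transitions, and the names `determinize`, `nfa_weight`, and `dfa_weight` are all assumptions made for illustration, and the sketch assumes an acyclic input with integer weights so that the construction is guaranteed to terminate.

```python
from collections import defaultdict, deque

# Hypothetical toy WFA: (source, label, weight, destination). Acyclic by design.
TRANSITIONS = [(0, "a", 1, 1), (0, "a", 2, 2),
               (1, "b", 3, 3), (2, "b", 1, 3), (1, "c", 1, 3)]
FINALS = {3: 0}  # final state -> final weight


def determinize(transitions, finals, start):
    """Weighted subset construction over the tropical (min, +) semiring.

    Each deterministic state is a frozenset of (state, residual weight) pairs.
    """
    out = defaultdict(list)
    for src, label, weight, dst in transitions:
        out[src].append((label, weight, dst))

    start_subset = frozenset({(start, 0)})
    det_trans, det_finals = {}, {}
    queue, seen = deque([start_subset]), {start_subset}
    while queue:
        subset = queue.popleft()
        final_costs = [r + finals[q] for q, r in subset if q in finals]
        if final_costs:
            det_finals[subset] = min(final_costs)
        # label -> {destination: cheapest (residual + transition weight)}
        best = defaultdict(dict)
        for q, r in subset:
            for label, w, dst in out[q]:
                cost = r + w
                if cost < best[label].get(dst, float("inf")):
                    best[label][dst] = cost
        for label, reach in best.items():
            w_min = min(reach.values())  # weight emitted on the new transition
            target = frozenset((dst, c - w_min) for dst, c in reach.items())
            det_trans[(subset, label)] = (w_min, target)
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return det_trans, det_finals, start_subset


def nfa_weight(transitions, finals, start, string):
    """Minimum total weight over all accepting paths, or None if rejected."""
    current = {start: 0}
    for ch in string:
        nxt = {}
        for src, label, w, dst in transitions:
            if label == ch and src in current:
                nxt[dst] = min(nxt.get(dst, float("inf")), current[src] + w)
        current = nxt
    costs = [c + finals[q] for q, c in current.items() if q in finals]
    return min(costs) if costs else None


def dfa_weight(det_trans, det_finals, start_subset, string):
    """Weight of the unique path in the determinized machine, or None."""
    state, total = start_subset, 0
    for ch in string:
        if (state, ch) not in det_trans:
            return None
        w, state = det_trans[(state, ch)]
        total += w
    return total + det_finals[state] if state in det_finals else None


det_trans, det_finals, det_start = determinize(TRANSITIONS, FINALS, 0)
```

On this toy input the two machines agree: the string "ab" has weight 3 and "ac" has weight 2 in both, while "b" is rejected by both. The residual weights stored inside each subset are what distinguish this from unweighted subset construction. On cyclic inputs that lack the twins property, those residuals can grow without bound, which is why the procedure may fail to terminate.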