An approach for efficient open vocabulary spoken term detection

  • Authors:
  • Atta Norouzian; Richard Rose

  • Venue:
  • Speech Communication
  • Year:
  • 2014

Abstract

This paper presents a hybrid two-pass approach to fast and efficient open vocabulary spoken term detection (STD). A large vocabulary continuous speech recognition (LVCSR) system produces word lattices from audio recordings. An index is then constructed from these lattices to support very fast search for occurrences of both in-vocabulary (IV) and out-of-vocabulary (OOV) query terms. Search for query terms is performed in two passes. In the first pass, a subword approach is used to identify, from the index, audio segments that are likely to contain occurrences of the IV and OOV query terms. In the second pass, a more detailed subword-based search verifies the occurrence of the query terms in the candidate segments. The STD system is evaluated on an open vocabulary STD task defined on a lecture domain corpus. The indexing method is shown to produce an index that is nearly two orders of magnitude smaller than the LVCSR lattices while preserving most of the information relevant for STD. Furthermore, although the index is constructed from word lattices, 67% of the segments containing occurrences of the OOV query terms are identified from the index in the first pass. Finally, the subword-based term detection performed in the second pass is shown to reduce the performance gap between OOV and IV query terms.
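
The two-pass idea described in the abstract can be illustrated with a toy sketch. This is not the authors' system. It assumes each audio segment is already reduced to a single phone sequence (the paper indexes LVCSR word lattices instead), uses phone bigrams as the subword unit, and uses approximate substring matching by edit distance for verification. All names, thresholds, and phone strings here (`build_index`, `first_pass`, `best_match_cost`, `detect`, `min_fraction`, `max_cost`) are hypothetical choices made for illustration.

```python
from collections import defaultdict


def ngrams(phones, n=2):
    """Return the overlapping phone n-grams of a phone sequence."""
    return [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]


def build_index(segments, n=2):
    """Map each phone n-gram to the set of segment ids that contain it."""
    index = defaultdict(set)
    for seg_id, phones in segments.items():
        for gram in ngrams(phones, n):
            index[gram].add(seg_id)
    return index


def first_pass(index, query, n=2, min_fraction=0.5):
    """Cheap pass: keep segments holding enough of the query's n-grams."""
    grams = set(ngrams(query, n))
    hits = defaultdict(int)
    for gram in grams:
        for seg_id in index.get(gram, ()):
            hits[seg_id] += 1
    return {s for s, c in hits.items() if c / len(grams) >= min_fraction}


def best_match_cost(query, phones):
    """Minimum edit distance between query and any substring of phones."""
    prev = [0] * (len(phones) + 1)  # a match may start anywhere in the segment
    for i, q in enumerate(query, start=1):
        curr = [i] + [0] * len(phones)
        for j, p in enumerate(phones, start=1):
            curr[j] = min(prev[j - 1] + (q != p),  # match or substitution
                          prev[j] + 1,             # query phone missing in segment
                          curr[j - 1] + 1)         # extra phone in segment
        prev = curr
    return min(prev)  # a match may end anywhere in the segment


def detect(segments, query, n=2, min_fraction=0.5, max_cost=0.2):
    """Detailed pass: verify the first-pass candidates by alignment cost."""
    candidates = first_pass(build_index(segments, n), query, n, min_fraction)
    return {s for s in candidates
            if best_match_cost(query, segments[s]) / len(query) <= max_cost}


# Hypothetical phone strings, invented for this example.
segments = {
    "seg1": ["dh", "ax", "l", "eh", "k", "ch", "er"],
    "seg2": ["s", "p", "iy", "ch"],
    "seg3": ["l", "eh", "k", "s"],
}
query = ["l", "eh", "k", "ch", "er"]

candidates = first_pass(build_index(segments), query)  # {"seg1", "seg3"}
verified = detect(segments, query)                     # {"seg1"}
```

In this toy setup the first pass passes `seg3` through because it shares half of the query's bigrams, and the second pass rejects it because the best alignment still needs two edits over five query phones. The split mirrors the design described above: an inexpensive index lookup narrows the search, and the costlier verification runs only on the surviving candidate segments.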