Position specific posterior lattices for indexing speech

  • Authors:
  • Ciprian Chelba; Alex Acero

  • Affiliations:
  • Microsoft Research, Microsoft Corporation, Redmond, WA; Microsoft Research, Microsoft Corporation, Redmond, WA

  • Venue:
  • ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
  • Year:
  • 2005

Abstract

The paper presents the Position Specific Posterior Lattice (PSPL), a novel representation of automatic speech recognition (ASR) lattices that naturally lends itself to efficient indexing of position information and subsequent relevance ranking of spoken documents using proximity.

In experiments performed on a collection of lecture recordings (MIT iCampus data), the spoken document ranking accuracy was improved by 20% relative over the commonly used baseline of indexing the 1-best output from an automatic speech recognizer. The Mean Average Precision (MAP) increased from 0.53 when using 1-best output to 0.62 when using the new lattice representation. The reference used for evaluation is the output of a standard retrieval engine working on the manual transcription of the speech collection.

Albeit lossy, the PSPL lattice is also much more compact than the ASR 3-gram lattice from which it is computed, which also translates into a reduced inverted index size, at virtually no degradation in word-error-rate performance. Since new paths are introduced in the lattice, the ORACLE accuracy increases over the original ASR lattice.
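
For readers unfamiliar with the metric, the following is a minimal sketch of how Mean Average Precision is commonly computed under binary relevance. It is not the authors' evaluation code: the function names, the document identifiers, and the assumption that the reference retrieval run supplies a plain set of relevant documents per query are all illustrative.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average precision of one ranked list against a set of relevant documents.

    Assumes binary relevance; a query with no relevant documents scores 0.0.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            # Precision measured at the rank of each relevant document.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Mean of the per-query average precision over (ranking, relevant set) pairs."""
    scores = [average_precision(ranked, relevant) for ranked, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical example: two queries with made-up document identifiers.
example_runs = [
    (["a", "b", "c"], {"a", "c"}),
    (["x", "y"], {"y"}),
]
example_map = mean_average_precision(example_runs)
```

In the first hypothetical query the relevant documents sit at ranks 1 and 3, so its average precision is (1/1 + 2/3) / 2; averaging such per-query values over all queries gives the MAP figure of the kind quoted in the abstract.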