Speech in Noisy Environments (SPINE) adds new dimension to speech recognition R&D

  • Authors:
  • Thomas H. Crystal; Astrid Schmidt-Nielsen; Elaine Marsh

  • Affiliations:
  • Consultant, Arlington, VA; Naval Research Laboratory, Washington, DC; Naval Research Laboratory, Washington, DC

  • Venue:
  • HLT '02 Proceedings of the second international conference on Human Language Technology Research
  • Year:
  • 2002

Abstract

The Naval Research Laboratory, with DARPA sponsorship, conducted the SPINE-1 and SPINE-2 evaluations to measure how well automatic speech recognizers process military-style speech in simulated military environments. Nine organizations ran a total of 27 systems on the test material. The overall word error rate (WER) for the primary system from each site was 39.6%, and the best-performing system achieved a WER of 27.5%. The environments included both background noise and voice coders. The more difficult coder-noise combinations generally had fewer speech turns and fewer words per second than the easier ones. The easier environments appeared to have more chit-chat than the more difficult ones, but because of other compensations, type counts were similar for easy and difficult environments. These changes in talker behavior influenced the accuracy of the speech recognizers.
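
The abstract reports results as WER without saying how it was scored. As a rough illustration only, the sketch below assumes the conventional definition: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The function name `word_error_rate` and the sample utterances are invented for this example and are not taken from the SPINE data or scoring tools.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumes whitespace tokenization and no text normalization; real
    evaluations typically normalize case, punctuation, and so on first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical utterance pair: one substitution plus one deletion
    # against a five-word reference.
    print(word_error_rate("fire at grid alpha seven", "fire a grid alpha"))  # 0.4
```

Under this definition a WER of 27.5% corresponds to roughly 27 or 28 word errors per 100 reference words. Because insertions count as errors, WER can exceed 100%.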