Speech in Noisy Environments (SPINE) adds new dimension to speech recognition R&D

Authors:
Thomas H. Crystal; Astrid Schmidt-Nielsen; Elaine Marsh
Affiliations:
Consultant, Arlington, VA; Naval Research Laboratory, Washington, DC; Naval Research Laboratory, Washington, DC
Venue:
HLT '02 Proceedings of the second international conference on Human Language Technology Research
Year:
2002

Abstract

The Naval Research Laboratory, with DARPA sponsorship, conducted the SPINE-1 and SPINE-2 evaluations to measure how well automatic speech recognizers process military-style speech in simulated military environments. Nine organizations ran a total of 27 systems on the test material. Across the primary systems from each site, the overall word error rate (WER) was 39.6%; the best-performing system achieved a WER of 27.5%. The environments included both background noise and voice coders. The more difficult coder-noise combinations generally had fewer speech turns and fewer words per second than the easier ones. The easier environments appeared to contain more chit-chat than the more difficult ones, but because of other compensations, type counts were similar for easy and difficult environments. These changes in talker behavior influenced the accuracy of the speech recognizers.
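
For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate of the kind reported above is conventionally computed: the minimum number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. The abstract does not describe the scoring tool or text normalization used in the SPINE evaluations, so the function name `word_error_rate`, the whitespace tokenization, and the example phrases are illustrative assumptions only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words.

    Assumption: words are separated by whitespace and compared exactly;
    real scoring pipelines usually normalize case and punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up utterance, not taken from the SPINE test material.
    wer = word_error_rate("hold fire on target two", "hold fire target too")
    print(f"WER = {wer:.1%}")
```

Because insertions are counted against the number of reference words, WER can exceed 100% when the recognizer outputs many spurious words, which is one reason noisy conditions can produce very high scores.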