Analytic assessment of telephone transmission impact on ASR performance using a simulation model

Authors:
Sebastian Möller; Hervé Bourlard
Affiliations:
Institut für Kommunikationsakustik, Ruhr-Universität Bochum, D-44780 Bochum, Germany; IDIAP -- Institut dalle Molle d'Intelligence Artificielle Perceptive, CP 592, CH-1920 Martigny, Switzerland
Venue:
Speech Communication
Year:
2002

Abstract

This paper addresses the impact of telephone transmission channels on automatic speech recognition (ASR) performance. A real-time simulation model is described and implemented that flexibly and efficiently generates the impairments encountered in traditional as well as modern (mobile, IP-based) networks. The model takes input parameters that are known to telephone network planners, so it can be applied without measuring specific network characteristics. It can be used for three purposes: an analytic assessment of the impact of channel impairments on ASR performance, the production of training material with defined transmission characteristics, and the testing of spoken dialogue systems in realistic network environments. This paper investigates the first of these. Two speech recognizers integrated into a spoken dialogue system for information retrieval are assessed under controlled amounts of transmission degradation. The measured degradation in ASR performance is compared with the degradation in speech quality in human-human communication. For some impairments, ASR performance behaves differently from the expected human quality judgments. This has to be taken into account both in telephone network planning and in speech and language technology development.
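
The idea of exposing a recognizer to "controlled amounts of transmission degradation" can be illustrated with a minimal sketch. It is not the simulation model described in the paper: it covers only two generic impairments (level attenuation and additive white noise), and the function names and the parameters attenuation_db and snr_db are hypothetical, chosen for illustration only. A sine tone stands in for speech.

```python
import math
import random


def power(samples):
    """Mean power of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def degrade(samples, attenuation_db=0.0, snr_db=None, seed=0):
    """Attenuate a signal and optionally add white Gaussian noise.

    attenuation_db and snr_db are hypothetical parameter names,
    not the input parameters of the model described in the paper.
    """
    gain = 10.0 ** (-attenuation_db / 20.0)
    out = [gain * s for s in samples]
    if snr_db is None:
        return out
    # Noise power needed to reach the requested SNR after attenuation.
    noise_power = power(out) / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, sigma) for s in out]


def measured_snr_db(clean, noisy):
    """SNR in dB between a clean reference and its noisy version."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10.0 * math.log10(power(clean) / power(noise))


if __name__ == "__main__":
    # One second of a 440 Hz tone at 8 kHz stands in for speech.
    fs = 8000
    tone = [math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]
    attenuated = degrade(tone, attenuation_db=6.0)
    for snr in (30, 20, 10, 0):
        noisy = degrade(tone, attenuation_db=6.0, snr_db=snr, seed=1)
        print(snr, round(measured_snr_db(attenuated, noisy), 1))
```

In an assessment of the kind the abstract describes, each degraded signal would be passed to the recognizer under test and its performance recorded against the impairment setting. That step is omitted here because the abstract does not specify the recognizers or the performance measure.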