This paper addresses the impact of telephone transmission channels on automatic speech recognition (ASR) performance. A real-time simulation model is described and implemented that flexibly and efficiently generates the impairments encountered in traditional as well as modern (mobile, IP-based) networks. The model is based on input parameters that are known to telephone network planners, so it can be applied without measuring specific network characteristics. It can be used for three purposes: an analytic assessment of the impact of channel impairments on ASR performance, the production of training material with defined transmission characteristics, and the testing of spoken dialogue systems in realistic network environments. This paper investigates the first of these. Two speech recognizers that are integrated into a spoken dialogue system for information retrieval are assessed under controlled amounts of transmission degradation. The measured degradation in ASR performance is compared with the degradation in speech quality in human-to-human communication. For some impairments, ASR behaves differently from what expected human quality judgments would suggest. This has to be taken into account both in telephone network planning and in speech and language technology development.
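The idea of assessing a recognizer under controlled amounts of degradation can be illustrated with a minimal sketch. It is not the simulation model described in the paper: white Gaussian noise at a chosen signal-to-noise ratio merely stands in for a single channel impairment, and word error rate is assumed as the performance measure. The function names `add_noise_at_snr` and `word_error_rate` are hypothetical.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, seed=0):
    """Return a copy of `signal` with white Gaussian noise added at `snr_db` dB.

    Assumption: white noise is only a stand-in for one kind of channel
    impairment; the paper's model covers more than this.
    """
    if not signal:
        raise ValueError("signal must not be empty")
    rng = random.Random(seed)
    signal_power = sum(s * s for s in signal) / len(signal)
    # SNR in dB is 10 * log10(signal_power / noise_power); solve for noise_power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words consumed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

In an experiment of this kind, each test utterance would be degraded at several noise levels, passed to the recognizer, and the resulting transcripts scored with `word_error_rate` against the reference text, giving one performance curve per impairment.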