ATREUS: a comparative study of continuous speech recognition systems at ATR

  • Authors:
  • A. Nagai; K. Yamaguchi; S. Sagayama; A. Kurematsu

  • Affiliations:
  • ATR Interpreting Telephony Research Laboratories, Kyoto, Japan (all authors)

  • Venue:
  • ICASSP '93: Proceedings of the 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing: Speech Processing, Volume II
  • Year:
  • 1993

Abstract

This paper describes ATREUS, a large and varied collection of continuous speech recognition systems developed at ATR Interpreting Telephony Research Laboratories that together form the spoken-input front end of an interpreting telephony system. ATREUS includes the following phone models: (1) discrete HMMs with fuzzy vector quantization and multiple codebooks, (2) continuous mixture density HMMs, (3) hidden Markov networks derived from the Successive State Splitting (SSS) algorithm, (4) Time-Delay Neural Networks, and (5) Fuzzy Partition Models. Its speaker modes are (a) speaker-dependent, (b) speaker-independent, and (c) speaker-adaptive, the last using techniques such as codebook mapping for VQ-HMMs, vector field smoothing for all types of HMMs, and neural network speaker mapping. ATREUS is one of the major achievements of the seven-year automatic interpreting telephony project, which is scheduled to end at the close of this fiscal year. The systems are compared in terms of structure, constituent techniques, hardware implementation, and performance. The combination called ATREUS/SSS-LR performed best among the ATREUS systems.