Using a large vocabulary continuous speech recognizer for a constrained domain with limited training

Authors:
M. Siu;M. Jonas;H. Gish
Affiliations:
BBN Technol./GTE Internetworking, Cambridge, MA, USA;-;-
Venue:
ICASSP '99: Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 01
Year:
1999

Citing 0
Cited 2

Evaluation of the robustness of the polynomial segment models to noisy environments with unsupervised adaptation

Speech Communication
Incremental word learning: Efficient HMM initialization and large margin discriminative adaptation

Speech Communication

Abstract

How to train a speech recognizer with a limited amount of training data is of interest to many researchers. We describe how we use BBN's Byblos large vocabulary continuous speech recognition (LVCSR) system for the military air-traffic-control domain, where we have less than an hour of training data. We investigate three ways to deal with the limited training data: (1) reconfigure the LVCSR system to use fewer parameters, (2) incorporate out-of-domain data, and (3) use pragmatic information, such as speaker identity and controller function, to improve recognition performance. We compare the LVCSR performance to that of a tied-mixture recognizer designed for a limited vocabulary. We show that the reconfigured LVCSR system outperforms the tied-mixture system by 10% absolute in word error rate. When enough data is available per speaker, vocal tract length normalization and supervised adaptation techniques can further improve performance by 6%, even for this domain with limited training. We also show that the use of out-of-domain data and pragmatic information, if available, can each further improve performance by 1-3%.
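
The abstract reports gains as absolute changes in word error rate (WER). As a reading aid, the following is a minimal sketch of how WER is conventionally computed (word-level edit distance divided by the number of reference words) and of how an absolute reduction differs from a relative one. The function names, the example utterance, and the 40% and 30% figures are hypothetical illustrations, not values or code from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def absolute_reduction(baseline_wer: float, new_wer: float) -> float:
    """Difference in WER, in the same units as the inputs."""
    return baseline_wer - new_wer


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Reduction expressed as a fraction of the baseline WER."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical utterance: one deleted word ("and") and one substitution
# ("five" recognized as "nine") against an 8-word reference.
example_wer = word_error_rate(
    "climb and maintain flight level three five zero",
    "climb maintain flight level three nine zero",
)

# Hypothetical WERs chosen only to show the absolute/relative distinction.
abs_gain = absolute_reduction(0.40, 0.30)
rel_gain = relative_reduction(0.40, 0.30)
```

With these made-up numbers, a drop from 40% to 30% WER is a 10-point absolute reduction but a 25% relative reduction, which is why the abstract's "absolute" qualifier matters when comparing systems.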