Dragon systems resource management benchmark results—February 1991

Authors:
James Baker;Janet Baker;Paul Bamberg;Larry Gillick;Lori Lamel;Robert Roth;Francesco Scattone;Dean Sturtevant;Ousmane Ba;Richard Benedict
Venue:
HLT '91 Proceedings of the workshop on Speech and Natural Language
Year:
1991

Citing 4

A tree-trellis based fast search for finding the N Best sentence hypotheses in continuous speech recognition
HLT '90 Proceedings of the workshop on Speech and Natural Language

The Dragon continuous speech recognition system: a real-time implementation
HLT '90 Proceedings of the workshop on Speech and Natural Language

Phoneme-in-context modeling for Dragon's continuous speech recognizer
HLT '90 Proceedings of the workshop on Speech and Natural Language

A rapid match algorithm for continuous speech recognition
HLT '90 Proceedings of the workshop on Speech and Natural Language

Cited 3

Session 2: DARPA resource management and ATIS benchmark test poster session
HLT '91 Proceedings of the workshop on Speech and Natural Language

Large vocabulary recognition of Wall Street Journal sentences at Dragon Systems
HLT '91 Proceedings of the workshop on Speech and Natural Language

Large vocabulary continuous speech recognition of Wall Street Journal data
ICASSP '93 Proceedings of the 1993 IEEE international conference on Acoustics, speech, and signal processing: speech processing - Volume II

Abstract

In this paper we present preliminary results obtained at Dragon Systems on the Resource Management benchmark task. The basic conceptual units of our system are Phonemes-in-Context (PICs), which are represented as Hidden Markov Models, each of which is expressed as a sequence of Phonetic Elements (PELs). The PELs corresponding to a given phoneme constitute a kind of alphabet for the representation of PICs.

For the speaker-dependent tests, two basic methods of training the acoustic models were investigated. The first method of training the Resource Management models is to re-estimate the models for each test speaker from that speaker's training data, keeping the PEL spellings of the PICs fixed. The second approach is to use the re-estimated models from the first method to derive a segmentation of the training data, then to respell the PICs in a largely speaker-dependent manner in order to improve the representation of speaker differences. A full explanation of these methods is given, as are results using each method.

In addition to reporting on two different training strategies, we discuss N-Best results. The N-Best algorithm is a modification of the algorithm proposed by Soong and Huang at the June 1990 workshop. This algorithm runs as a post-processing step and uses an A*-search (an algorithm also known as a 'stack decoder').
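
The idea of producing N-Best hypotheses with an A*-search in a post-processing step can be illustrated with a minimal Python sketch. It is not Dragon's implementation, nor the Soong and Huang algorithm as published. It assumes a tiny hand-made word lattice whose nodes, words, and integer costs (standing in for negative log scores) are invented for illustration, and the function name `n_best_paths` is hypothetical. A first pass computes the exact best remaining cost from every node to the end of the lattice. A second pass expands partial hypotheses from a priority queue ordered by cost so far plus that remaining cost, so complete hypotheses come off the queue in order of total cost.

```python
import heapq
from collections import defaultdict


def n_best_paths(edges, start, goal, n):
    """Return up to n lowest-cost word sequences through a lattice.

    edges: iterable of (src, dst, word, cost) with non-negative costs,
           forming a directed acyclic graph (illustrative assumption).
    """
    out_edges = defaultdict(list)
    in_edges = defaultdict(list)
    for src, dst, word, cost in edges:
        out_edges[src].append((dst, word, cost))
        in_edges[dst].append((src, cost))

    # Pass 1: exact best cost from each node to the goal
    # (Dijkstra run backwards from the goal node).
    inf = float("inf")
    best_to_goal = defaultdict(lambda: inf)
    best_to_goal[goal] = 0.0
    heap = [(0.0, goal)]
    while heap:
        dist, node = heapq.heappop(heap)
        if dist > best_to_goal[node]:
            continue  # stale queue entry
        for src, cost in in_edges[node]:
            new_dist = dist + cost
            if new_dist < best_to_goal[src]:
                best_to_goal[src] = new_dist
                heapq.heappush(heap, (new_dist, src))

    # Pass 2: A* over partial hypotheses. Priority = cost so far + exact
    # remaining cost, so complete paths are popped in order of total cost.
    results = []
    counter = 0  # tie-breaker so the heap never compares word tuples
    stack = [(best_to_goal[start], counter, 0.0, start, ())]
    while stack and len(results) < n:
        _, _, cost_so_far, node, words = heapq.heappop(stack)
        if node == goal:
            results.append((cost_so_far, list(words)))
            continue
        for dst, word, cost in out_edges[node]:
            remaining = best_to_goal[dst]
            if remaining == inf:
                continue  # this edge cannot reach the goal
            counter += 1
            new_cost = cost_so_far + cost
            heapq.heappush(
                stack,
                (new_cost + remaining, counter, new_cost, dst, words + (word,)),
            )
    return results


# Made-up lattice: nodes 0..3, each edge is (src, dst, word, cost).
LATTICE = [
    (0, 1, "show", 1), (0, 1, "slow", 3),
    (1, 2, "all", 1), (1, 2, "the", 2),
    (2, 3, "ships", 1), (2, 3, "chips", 4),
]

for total, words in n_best_paths(LATTICE, start=0, goal=3, n=3):
    print(total, " ".join(words))
```

Because the remaining-cost estimate here is exact, the first complete hypothesis popped is the best one, the second is the next best, and so on. A real recognizer would derive those estimates from its own earlier search pass rather than from a toy lattice.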