Context dependent modeling of phones in continuous speech using decision trees

Authors:
L. R. Bahl;P. V. de Souza;P. S. Gopalakrishnan;D. Nahamoo;M. A. Picheny
Venue:
HLT '91 Proceedings of the workshop on Speech and Natural Language
Year:
1991

Citing 3

Multilevel decoding for very-large-size-dictionary speech recognition
IBM Journal of Research and Development

Automatic discovery of contextual factors describing phonological variation
HLT '89 Proceedings of the workshop on Speech and Natural Language

An iterative 'flip-flop' approximation of the most informative split in the construction of decision trees
ICASSP '91 Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing

Cited 7

Tree-based state tying for high accuracy acoustic modelling
HLT '94 Proceedings of the workshop on Human Language Technology

High-accuracy large-vocabulary speech recognition using mixture tying and consistency modeling
HLT '94 Proceedings of the workshop on Human Language Technology

Eigenvalues Driven Gaussian Selection in continuous speech recognition using HMMs with full covariance matrices
Applied Intelligence

Evolution of the ASR decoder design
TSD'10 Proceedings of the 13th international conference on Text, speech and dialogue

Task independent wordspotting using decision tree based allophone clustering
ICASSP'93 Proceedings of the 1993 IEEE international conference on Acoustics, speech, and signal processing: speech processing - Volume II

Automatic rule extraction for modeling pronunciation variation
CICLing'11 Proceedings of the 12th international conference on Computational linguistics and intelligent text processing - Volume Part II

Direct construction of compact context-dependency transducers from data
Computer Speech and Language

Abstract

In a continuous speech recognition system, it is important to model the context-dependent variations in the pronunciations of words. In this paper, we present an automatic method for modeling phonological variation using decision trees. For each phone, we construct a decision tree that specifies the acoustic realization of the phone as a function of the context in which it appears. Several thousand sentences from a natural language corpus, spoken by several talkers, are used to construct these decision trees. Experimental results on a 5000-word vocabulary natural language speech recognition task are presented.
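
The abstract does not give the questions, splitting criterion, or stopping rule used in the paper. As a rough illustration only of the general idea of one decision tree per phone that maps context to an acoustic realization, the sketch below grows a small tree by greedy entropy reduction. Everything in it is an assumption made for the example: the phone symbols, the realization labels ("flap", "aspirated", "unaspirated"), the three context questions, and the names `grow` and `classify`.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def split(samples, question):
    """Partition samples by a question (position, phone_set) about the context."""
    position, phone_set = question
    yes = [s for s in samples if s[0][position] in phone_set]
    no = [s for s in samples if s[0][position] not in phone_set]
    return yes, no

def grow(samples, questions, min_gain=1e-9):
    """Greedily grow a tree, choosing the question with the largest entropy reduction."""
    labels = [label for _, label in samples]
    base = entropy(labels)
    best = None
    for q in questions:
        yes, no = split(samples, q)
        if not yes or not no:
            continue  # this question does not separate the samples
        remainder = (len(yes) * entropy([l for _, l in yes])
                     + len(no) * entropy([l for _, l in no])) / len(samples)
        gain = base - remainder
        if best is None or gain > best[0]:
            best = (gain, q, yes, no)
    if best is None or best[0] <= min_gain:
        return {"leaf": Counter(labels)}  # leaf stores the realization counts
    _, q, yes, no = best
    return {"question": q,
            "yes": grow(yes, questions, min_gain),
            "no": grow(no, questions, min_gain)}

def classify(tree, context):
    """Follow the questions down to a leaf and return its most frequent realization."""
    while "leaf" not in tree:
        position, phone_set = tree["question"]
        tree = tree["yes"] if context[position] in phone_set else tree["no"]
    return tree["leaf"].most_common(1)[0][0]

# Hypothetical toy data for one phone: (context, observed realization).
VOWELS = frozenset({"AA", "IY", "AX", "ER"})
samples = [
    ({"left": "AA", "right": "ER"}, "flap"),
    ({"left": "IY", "right": "AX"}, "flap"),
    ({"left": "S", "right": "AA"}, "unaspirated"),
    ({"left": "S", "right": "IY"}, "unaspirated"),
    ({"left": "#", "right": "AA"}, "aspirated"),
    ({"left": "#", "right": "IY"}, "aspirated"),
]
# Hypothetical questions: is the left/right neighbor in a given phone set?
questions = [("left", VOWELS), ("left", frozenset({"S"})), ("right", VOWELS)]

tree = grow(samples, questions)
print(classify(tree, {"left": "AA", "right": "AX"}))
print(classify(tree, {"left": "S", "right": "ER"}))
```

In this sketch each leaf keeps the counts of realizations seen in that context class, so a recognizer could in principle read off a distribution rather than a single label; whether the paper does so is not stated in the abstract.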