In large-vocabulary speech recognition, there are always new triphones that are not covered by the training data. These unseen triphones are usually represented by the corresponding diphones or context-independent monophones. We propose using decision-tree-based senones to generate the senonic baseforms needed for unseen triphones. A decision tree is built for each individual Markov state of each phone, and the leaves of the trees constitute the senone codebook. A Markov state of any triphone traverses the corresponding tree until it reaches a leaf, which identifies the senone with which it is to be associated. We evaluate the proposed method on the DARPA 5,000-word speaker-independent Wall Street Journal dictation task. The word error rate is reduced by more than 10% when unseen triphones are modeled by decision-tree-based senones.
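
The lookup described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say what questions the trees ask, so the sketch assumes yes/no questions about the left and right phonetic context. The phone classes, the three-state topology, the tree shapes, and the senone numbers are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass(frozen=True)
class Triphone:
    left: str   # left-context phone
    base: str   # the phone being modeled
    right: str  # right-context phone


@dataclass
class Leaf:
    senone_id: int  # index into the senone codebook


@dataclass
class Node:
    question: Callable[[Triphone], bool]  # yes/no question about the context
    yes: Union["Node", Leaf]
    no: Union["Node", Leaf]


def find_senone(tree: Union[Node, Leaf], triphone: Triphone) -> int:
    """Walk one state's tree from the root to a leaf and return its senone."""
    node = tree
    while isinstance(node, Node):
        node = node.yes if node.question(triphone) else node.no
    return node.senone_id


# Hypothetical phone classes used by the example questions.
NASALS = {"m", "n", "ng"}
STOPS = {"p", "t", "k", "b", "d", "g"}

# One tree per (phone, state index). Assumed here: 3 states per phone and
# toy trees for the phone "ae" only. Real trees would be grown from data.
TREES: Dict[Tuple[str, int], Union[Node, Leaf]] = {
    ("ae", 0): Node(lambda t: t.left in NASALS,
                    Leaf(0),
                    Node(lambda t: t.left in STOPS, Leaf(1), Leaf(2))),
    ("ae", 1): Node(lambda t: t.right in NASALS, Leaf(3), Leaf(4)),
    ("ae", 2): Node(lambda t: t.right in STOPS, Leaf(5), Leaf(6)),
}


def senonic_baseform(triphone: Triphone,
                     trees: Dict[Tuple[str, int], Union[Node, Leaf]],
                     n_states: int = 3) -> List[int]:
    """Return one senone per Markov state for any triphone, seen or unseen."""
    return [find_senone(trees[(triphone.base, s)], triphone)
            for s in range(n_states)]


if __name__ == "__main__":
    # Neither triphone needs to have occurred in training: the trees only
    # inspect the context phones, so every context reaches some leaf.
    print(senonic_baseform(Triphone("m", "ae", "t"), TREES))
    print(senonic_baseform(Triphone("s", "ae", "n"), TREES))
```

Because every question has both a yes and a no branch, any context ends at a leaf, which is why the same trees can serve triphones that were absent from the training data.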