Information Retrieval
Spanish recognizer of continuously spelled names over the telephone
Speech Communication
Multiple approaches to robust speech recognition
HLT '91 Proceedings of the workshop on Speech and Natural Language
Improved spelling recognition using a tree-based fast lexical match
ICASSP '99 Proceedings of the Acoustics, Speech, and Signal Processing, 1999. on 1999 IEEE International Conference - Volume 02
ICASSP'93 Proceedings of the 1993 IEEE international conference on Acoustics, speech, and signal processing: speech processing - Volume II
HCRF-UBM approach for text-independent speaker identification
Computers & Mathematics with Applications
Hi-index | 0.09 |
Spelling speech recognition can be applied for several purposes including enhancement of speech recognition systems and implementation of name retrieval systems. This paper presents an approach to construct three recognizers for the three commonly-used Thai spelling methods based on hidden Markov models (HMMs). The Thai phonetic characteristics, alphabet system and spelling methods are analyzed. For the first spelling method, two recognizers, each trained from a small spelling corpus and an existing large continuous speech corpus, are explored. To solve utterance speed difference between spelling utterances and continuous speech utterances, the adjustment of utterance speed is taken into account. Two alternative language models, bigram and trigram, are investigated to evaluate the performance of spelling speech recognition under three different environments: close-type, open-type and mix-type language models. For the first spelling method, our approach achieves up to 93.09% letter correct rate (LCR) and 92.45% letter accuracy (LA) when the language model is trigram under the mix-type environment and the acoustic model is trained from the small spelling corpus. Under the same conditions, we obtained 81.12% LCR and 76.32% LA for the second spelling method and 78.47% LCR and 71.75% LA for the third spelling method. By analyzing the results, it was found that the main source of the errors was letter substitution, which is mostly triggered by the confusion of similar consonant phones and the confusion of short/long vowel pairs.