Tone recognition is a basic but important task in speech recognition and assessment of tonal languages such as Mandarin Chinese. Most previously proposed methods take a two-step approach: syllables within an utterance are first identified via forced alignment, and a classifier, such as a neural network, Gaussian mixture model (GMM), hidden Markov model (HMM), or support vector machine (SVM), is then applied to each segmented syllable to predict its tone. However, forced alignment does not always produce accurate syllable boundaries, which leads to unstable voiced-unvoiced detection and degrades tone recognition performance. To alleviate this problem, we propose a robust approach called Tone Recognition Using Extended Segments (TRUES) for HMM-based continuous tone recognition. The proposed approach extracts an unbroken pitch contour from a given utterance using dynamic programming over the average magnitude difference function (AMDF), a time-domain acoustic feature. The pitch contour of each syllable is then extended for tri-tone HMM modeling, so that the influence of inaccurate syllable boundaries is lessened. Our experimental results show that TRUES achieves a 49.13% relative error rate reduction over the recently proposed supratone modeling, which is regarded as the state of the art in tone recognition and outperforms several earlier approaches. This improvement demonstrates the effectiveness and robustness of TRUES, as well as of the corresponding pitch determination algorithm, which produces unbroken pitch contours.
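The pitch extraction step can be illustrated with a minimal sketch, assuming a simple cost design that the abstract does not specify. Each frame's AMDF is normalized and used as a local cost per candidate lag, and a Viterbi-style dynamic program picks one lag per frame while penalizing large lag jumps between neighboring frames, so every frame receives a pitch value and the contour has no gaps. The function names, frame sizes, search range, and `jump_penalty` below are all hypothetical choices for illustration, not the authors' algorithm.

```python
import math

def amdf(frame, min_lag, max_lag):
    """Average magnitude difference of a frame for each lag in [min_lag, max_lag]."""
    n = len(frame)
    values = []
    for lag in range(min_lag, max_lag + 1):
        total = sum(abs(frame[i] - frame[i + lag]) for i in range(n - lag))
        values.append(total / (n - lag))
    return values

def track_pitch(signal, sample_rate, frame_len=400, hop=80,
                f_min=120.0, f_max=400.0, jump_penalty=0.05):
    """Return one F0 estimate (Hz) per frame; all defaults are assumed values."""
    min_lag = int(sample_rate / f_max)
    max_lag = int(sample_rate / f_min)
    n_lags = max_lag - min_lag + 1

    # Local cost: AMDF scaled by its per-frame peak (low value = likely period).
    costs = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        d = amdf(signal[start:start + frame_len], min_lag, max_lag)
        peak = max(d) or 1.0  # a silent frame yields a flat, all-zero cost
        costs.append([v / peak for v in d])

    # Forward pass: accumulated cost plus a penalty proportional to the lag jump.
    best = costs[0][:]
    back = []
    for t in range(1, len(costs)):
        new_best, pointers = [], []
        for j in range(n_lags):
            k = min(range(n_lags),
                    key=lambda p: best[p] + jump_penalty * abs(p - j))
            new_best.append(best[k] + jump_penalty * abs(k - j) + costs[t][j])
            pointers.append(k)
        best = new_best
        back.append(pointers)

    # Backtrack from the cheapest final lag to recover the full path.
    j = min(range(n_lags), key=best.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [sample_rate / (min_lag + j) for j in path]

# Usage: a synthetic 200 Hz tone sampled at 8 kHz (0.2 s).
sample_rate = 8000
signal = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(1600)]
contour = track_pitch(signal, sample_rate)
print(len(contour), min(contour), max(contour))
```

In this sketch no voiced-unvoiced decision is made at all: a frame with a flat AMDF contributes no preference of its own, so the transition penalty carries the neighboring lag through it. That is one plausible way to obtain an unbroken contour. Note that a wider search range (for example a lower `f_min`) admits lags at multiples of the true period, which a practical tracker would need to handle.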