Fundamentals of speech recognition
This paper presents two approaches to text-dependent speaker verification: a combined VQ-DTW method and a combined TESPAR-DTW method. In the first approach, VQ is used in the training stage to generate a model for each speaker from the training utterances, and again in the test stage to compute a model for the test utterance. Before DTW is applied to compute the final distance, the test sequence of centroids is rearranged so that its first position holds the test centroid closest to the first centroid of the reference sequence, and so on for the remaining positions. The second approach aims to avoid the complicated process associated with TESPAR alphabet generation. DTW is used to align sequences of epochs in order to generate a verification decision. TESPAR coding is used to generate speech features (sequences of epochs, each with a specific shape and duration) for each speaker utterance, in both the training and testing stages. DTW is used for successive alignments to compute a speaker model in the training stage, and in the testing stage to evaluate the distance between the speaker's models and the test utterance model.
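The centroid rearrangement and the DTW distance of the first approach can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the Euclidean local distance, the greedy one-to-one matching (each test centroid is used at most once), the unconstrained DTW step pattern, and the helper names `euclidean`, `reorder_centroids`, and `dtw_distance`, which are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def reorder_centroids(reference, test):
    """Rearrange test centroids to follow the order of the reference centroids.

    Assumption: greedy matching without replacement. For each reference
    centroid in turn, the nearest still-unused test centroid is taken.
    Any leftover test centroids are appended in their original order.
    """
    remaining = list(test)
    ordered = []
    for ref in reference:
        if not remaining:
            break
        nearest = min(range(len(remaining)),
                      key=lambda i: euclidean(ref, remaining[i]))
        ordered.append(remaining.pop(nearest))
    ordered.extend(remaining)
    return ordered


def dtw_distance(seq_a, seq_b):
    """Classic DTW: accumulated cost of the cheapest monotonic alignment."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # advance in seq_a only
                                     cost[i][j - 1],      # advance in seq_b only
                                     cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


if __name__ == "__main__":
    # Toy 2-D centroids, purely illustrative (not data from the paper).
    reference = [(0.0, 0.0), (5.0, 5.0), (10.0, 10.0)]
    test = [(10.0, 11.0), (0.0, 1.0), (5.0, 4.0)]
    aligned = reorder_centroids(reference, test)
    print(aligned)
    print(dtw_distance(reference, aligned))
```

In a verification setting, the resulting distance would be compared with a speaker-specific threshold to accept or reject the claimed identity; how that threshold is chosen is not described in the abstract.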