Reduced feature-set based parallel CHMM speech recognition systems
Information Sciences—Informatics and Computer Science: An International Journal - Special issue: Spoken language analysis, modeling and recognition-statistical and adaptive connectionist approaches
Enhancing Speaker Discrimination at the Feature Level
Speaker Classification I
MLP internal representation as discriminative features for improved speaker recognition
NOLISP'05 Proceedings of the 3rd international conference on Non-Linear Analyses and Algorithms for Speech Processing
Hi-index | 0.00 |
We evaluate the performance of several feature sets on the Aurora task as defined by ETSI. We show that after a non-linear transformation, a number of features can be effectively used in a HMM-based recognition system. The non-linear transformation is computed using a neural network which is discriminatively trained on the phonetically labeled (forcibly aligned) training data. A combination of the non-linearly transformed PLP (perceptive linear predictive coefficients), MSG (modulation filtered spectrogram) and TRAP (temporal pattern) features yields a 63% improvement in error rate as compared to baseline me frequency cepstral coefficients features. The use of the non-linearly transformed RASTA-like features, with system parameters scaled down to take into account the ETSI imposed memory and latency constraints, still yields a 40% improvement in error rate.