In this paper, we propose a technique to derive robust features for multilingual acoustic modeling with hidden Markov model-Gaussian mixture model (HMM-GMM) systems. We achieve this by discriminatively combining the phonetic contexts of the target languages (the languages in the multilingual system). Phonetic context is captured using a wide temporal context of the features, and the dimensionality of the resulting feature set is reduced to suit the HMM-GMM implementation using a neural network with a bottleneck in one of its hidden layers. The output of the bottleneck layer, taken before the non-linearity, is the new feature. Since the features are optimized for the target languages in the multilingual recognizer, they are referred to as Target Languages Oriented Features (TLOF). We perform our experiments on two of the most widely spoken Indian languages, Hindi and Tamil. TLOF offers significant performance improvements over both monolingual and multilingual phone recognizers using Mel frequency cepstral coefficients (MFCC). This suggests that TLOF can help share data across languages. TLOF was also seen to enhance the performance of monolingual acoustic models compared to systems using MFCC.
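The feature extraction described above (wide temporal context, then a bottleneck layer read out before its non-linearity) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The 39-dimensional input, the context of 4 frames on each side, the 500-unit hidden layer, the 39-dimensional bottleneck, the sigmoid activation, and the function names `splice_frames` and `bottleneck_features` are all assumptions made for illustration. The weights are random placeholders, whereas in the paper the network is trained discriminatively on the target languages.

```python
import numpy as np


def splice_frames(feats, context):
    """Stack each frame with `context` neighbours on either side.

    Edge frames are padded by repetition (an assumption of this sketch).
    """
    num_frames = feats.shape[0]
    padded = np.pad(feats, ((context, context), (0, 0)), mode="edge")
    # Block i of row t holds original frame t + i - context (clamped at the edges).
    return np.hstack([padded[i:i + num_frames] for i in range(2 * context + 1)])


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def bottleneck_features(spliced, w_hidden, b_hidden, w_bn, b_bn):
    """Forward pass up to the bottleneck; return its linear (pre-activation) output."""
    hidden = sigmoid(spliced @ w_hidden + b_hidden)
    # No non-linearity here: the pre-activation output is used as the feature.
    return hidden @ w_bn + b_bn


rng = np.random.default_rng(0)

# Placeholder input standing in for 100 frames of 39-dimensional MFCCs.
mfcc = rng.standard_normal((100, 39))

context = 4          # hypothetical: 4 frames on each side, 9 frames in total
hidden_size = 500    # hypothetical hidden layer width
bottleneck_size = 39 # hypothetical bottleneck width

spliced = splice_frames(mfcc, context)

# Random placeholder weights; a real system would train these on phone targets.
w_hidden = 0.01 * rng.standard_normal((spliced.shape[1], hidden_size))
b_hidden = np.zeros(hidden_size)
w_bn = 0.01 * rng.standard_normal((hidden_size, bottleneck_size))
b_bn = np.zeros(bottleneck_size)

tlof = bottleneck_features(spliced, w_hidden, b_hidden, w_bn, b_bn)
print(spliced.shape, tlof.shape)
```

In this sketch the spliced input has 9 x 39 = 351 dimensions per frame, and the bottleneck reduces it to a 39-dimensional vector per frame, a size that a conventional HMM-GMM system can model in place of MFCCs.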