Continuous Malayalam speech recognition using Hidden Markov Models

Authors:
Anuj Mohamed; K. N. Ramachandran Nair
Affiliations:
Mahatma Gandhi University, Kottayam, Kerala, India; Viswajyothi College of Engineering and Technology, Kerala, India
Venue:
Proceedings of the 1st Amrita ACM-W Celebration on Women in Computing in India
Year:
2010

Abstract

Accurate and computationally efficient recognition of continuous speech has been a subject of research in recent years. This paper reports the development of a small-vocabulary, speaker-independent continuous Malayalam speech recognition system based on Hidden Markov Models (HMMs). Phonemes are modeled with continuous density HMMs, which represent the general case where the observation probability density functions (pdfs) are continuous; each observation pdf is approximated by a Gaussian mixture density. Acoustic features are extracted from the input signal as Mel-frequency cepstral coefficients (MFCCs). To represent temporal variations in the speech signal, the first- and second-order derivatives of the MFCCs are appended to the set of static parameters. Training and decoding are performed with the Baum-Welch and Viterbi algorithms, respectively. The corpus contains naturally and continuously spoken sentences with multiple pronunciations and speaker variations. On evaluation, the proposed system produced promising results, with 94.67% word accuracy and 93.33% sentence correctness.
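
The pipeline summarized in the abstract (static features extended with first- and second-order derivatives, Gaussian mixture observation densities, Viterbi decoding) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the central-difference delta scheme, the two-state left-to-right topology, and all numeric parameters are assumptions chosen for illustration. The sketch sets model parameters by hand instead of training them with Baum-Welch, and uses one-dimensional toy frames in place of real MFCC vectors.

```python
import math

NEG_INF = float("-inf")

def add_deltas(frames):
    """Append first- and second-order differences to each static vector."""
    def diff(seq):
        # Central difference with edge replication (an assumed scheme,
        # not necessarily the one used in the paper).
        n = len(seq)
        out = []
        for t in range(n):
            prev, nxt = seq[max(t - 1, 0)], seq[min(t + 1, n - 1)]
            out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
        return out
    d1 = diff(frames)
    d2 = diff(d1)
    return [s + a + b for s, a, b in zip(frames, d1, d2)]

def log_gmm(x, weights, means, variances):
    """Log density of x under a diagonal-covariance Gaussian mixture."""
    comps = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, m, v in zip(x, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        comps.append(ll)
    top = max(comps)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(c - top) for c in comps))

def viterbi(obs, log_init, log_trans, emit):
    """Most likely state path; emit(state, x) returns a log density."""
    n = len(log_init)
    score = [log_init[s] + emit(s, obs[0]) for s in range(n)]
    back = []
    for x in obs[1:]:
        ptr, new = [], []
        for s in range(n):
            best = max(range(n), key=lambda r: score[r] + log_trans[r][s])
            ptr.append(best)
            new.append(score[best] + log_trans[best][s] + emit(s, x))
        score = new
        back.append(ptr)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):  # trace the back-pointers to the first frame
        state = ptr[state]
        path.append(state)
    return path[::-1], max(score)

# Hypothetical two-state left-to-right model with two mixtures per state.
states = [
    ([0.5, 0.5], [[0.0, 0.0, 0.0], [0.2, 0.0, 0.0]], [[1.0] * 3] * 2),
    ([0.5, 0.5], [[3.0, 0.0, 0.0], [3.2, 0.0, 0.0]], [[1.0] * 3] * 2),
]

def emit(s, x):
    return log_gmm(x, *states[s])

log_init = [0.0, NEG_INF]
log_trans = [[math.log(0.5), math.log(0.5)], [NEG_INF, 0.0]]

toy_frames = [[0.0], [0.1], [0.0], [3.0], [3.1], [3.0]]  # stand-in for MFCCs
features = add_deltas(toy_frames)
path, log_score = viterbi(features, log_init, log_trans, emit)
print(path, log_score)
```

With these toy parameters the decoded path stays in state 0 for the first three frames and moves to state 1 for the last three. A recognizer of the kind described would additionally need phoneme-level models trained on the corpus and a pronunciation lexicon to connect them into words and sentences, none of which is shown here.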