Audio-visual speech recognition based on AAM parameter and phoneme analysis of visual feature
PSIVT'11 Proceedings of the 5th Pacific Rim conference on Advances in Image and Video Technology - Volume Part I
Robust AAM-based audio-visual speech recognition against face direction changes
Proceedings of the 20th ACM international conference on Multimedia
To address the problem of visual feature extraction in lipreading, this paper proposes a new method based on DCT+LDA. First, the region of interest (ROI) is located using lip contour information, and the discrete cosine transform (DCT) is applied to the ROI. Linear discriminant analysis (LDA) is then introduced to extract the most discriminative feature vectors from the DCT coefficients and to further reduce the feature dimensionality. Experiments were performed on speaker-dependent (SD) and speaker-independent (SI) bimodal databases. The results show that the algorithm achieves higher recognition accuracy than both the traditional zig-zag DCT coefficient selection method and the DCT+PCA algorithm. Finally, the algorithm is also validated on our real-time lipreading platform.
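The two feature-selection routes compared in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (`dct2`, `zigzag`, `solve`, `fisher_direction`), the square ROI, the ridge term, and the restriction to a two-class Fisher discriminant are all assumptions made for illustration, whereas the paper presumably uses multi-class LDA over many DCT coefficients.

```python
import math

def dct2(block):
    """Orthonormal 2D DCT-II of a square grayscale block (list of lists)."""
    n = len(block)
    # 1D orthonormal DCT-II basis, basis[k][i]
    basis = [[(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
              * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
              for i in range(n)] for k in range(n)]
    # Transform rows, then columns: C = B X B^T
    rows = [[sum(basis[k][j] * block[i][j] for j in range(n))
             for k in range(n)] for i in range(n)]
    return [[sum(basis[u][i] * rows[i][v] for i in range(n))
             for v in range(n)] for u in range(n)]

def zigzag(coeffs, count):
    """Baseline selection: first `count` coefficients in zig-zag order."""
    n = len(coeffs)
    order = sorted(((u, v) for u in range(n) for v in range(n)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else p[1]))
    return [coeffs[u][v] for u, v in order[:count]]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fisher_direction(class_a, class_b, ridge=1e-6):
    """Two-class Fisher discriminant: w = Sw^-1 (mean_a - mean_b).

    `ridge` is an assumed regularizer that keeps Sw invertible.
    """
    d = len(class_a[0])
    mean = lambda xs: [sum(x[j] for x in xs) / len(xs) for j in range(d)]
    ma, mb = mean(class_a), mean(class_b)
    # Within-class scatter matrix, accumulated over both classes
    sw = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    for xs, mu in ((class_a, ma), (class_b, mb)):
        for x in xs:
            for i in range(d):
                for j in range(d):
                    sw[i][j] += (x[i] - mu[i]) * (x[j] - mu[j])
    return solve(sw, [ma[j] - mb[j] for j in range(d)])

def project(w, x):
    """Project a feature vector onto the discriminant direction."""
    return sum(wi * xi for wi, xi in zip(w, x))
```

In this sketch, a lip ROI would first be passed through `dct2`, a candidate set of coefficients would be read out with `zigzag`, and `fisher_direction` would then be fitted on those vectors so that `project` yields a lower-dimensional, class-discriminative feature. The zig-zag baseline stops after the second step and keeps the low-frequency coefficients as they are.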