Audio-visual isolated words recognition for voice dialogue system

  • Authors:
  • Josef Chaloupka

  • Affiliations:
  • Institute of Information Technology, Technical University of Liberec, Liberec, Czech Republic

  • Venue:
  • COST'10 Proceedings of the 2010 international conference on Analysis of Verbal and Nonverbal Communication and Enactment
  • Year:
  • 2010

Abstract

This contribution describes experiments in audio-visual isolated-word recognition. The results will be used to improve our voice dialogue system by adding visual speech recognition. Voice dialogue systems can be deployed in train or bus stations (or elsewhere) where noise levels are relatively high, so the visual component of speech can improve the recognition rate, mainly in noisy conditions. Audio-visual recognition of isolated words in our experiments was based on two-stream Hidden Markov Models (HMMs) and on HMMs of individual Czech phonemes and visemes. Different visual speech features and different numbers of HMM states and mixtures were evaluated in separate tests. In subsequent experiments, isolated words were recognized after the HMMs had been trained, and babble noise was added to the acoustic speech signal in successive steps.
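
The two mechanisms named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the audio and visual streams are weighted or at which noise levels the babble noise was added, so everything below is an assumption: the common formulation in which a two-stream HMM combines per-stream log-likelihoods as a weighted sum, and noise mixing at a chosen signal-to-noise ratio (SNR). The function names `combined_log_likelihood` and `mix_at_snr`, the parameter `audio_weight`, and all numeric values are hypothetical.

```python
import math


def combined_log_likelihood(log_p_audio, log_p_visual, audio_weight):
    """Weighted sum of per-stream log-likelihoods.

    Assumption: the stream weights sum to 1, so the visual weight is
    1 - audio_weight. The abstract does not state the weighting used.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must be in [0, 1]")
    return audio_weight * log_p_audio + (1.0 - audio_weight) * log_p_visual


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the result has the requested SNR in dB.

    Both inputs are equal-length sequences of samples. The SNR levels to use
    are not given in the abstract; snr_db is a free parameter here.
    """
    if len(speech) != len(noise) or not speech:
        raise ValueError("speech and noise must be non-empty and equal length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0.0:
        raise ValueError("noise must not be silent")
    # Choose gain so that speech_power / (gain**2 * noise_power) = 10**(snr_db / 10).
    gain = math.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return [s + gain * n for s, n in zip(speech, noise)]


if __name__ == "__main__":
    # Hypothetical per-stream scores for one word hypothesis.
    print(combined_log_likelihood(-10.0, -20.0, audio_weight=0.7))
    # Toy signals; real babble noise would come from a recording.
    print(mix_at_snr([1.0, -1.0, 1.0, -1.0], [0.5, 0.5, -0.5, -0.5], snr_db=0.0))
```

Under this formulation, lowering `audio_weight` as the SNR drops shifts the decision toward the visual stream, which is one way the visual component could help in noisy conditions.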