Audio-visual speech recognition using red exclusion and neural networks

  • Authors:
  • Trent W. Lewis;David M. W. Powers

  • Affiliations:
  • Flinders University of South Australia, Adelaide, South Australia 5001;Flinders University of South Australia, Adelaide, South Australia 5001

  • Venue:
  • ACSC '02 Proceedings of the twenty-fifth Australasian conference on Computer science - Volume 4
  • Year:
  • 2002

Abstract

Automatic speech recognition (ASR) performs well under restricted conditions, but performance degrades in noisy environments. Audio-Visual Speech Recognition (AVSR) combats this by incorporating a visual signal into the recognition. This paper briefly reviews the contribution of psycholinguistics to this endeavour and the recent advances in machine AVSR. An important first step in AVSR is that of feature extraction from the mouth region, and a technique developed by the authors is briefly presented. This paper examines how useful this extraction technique is at the given task in combination with several integration architectures, demonstrates that vision does in fact assist speech recognition when used in a linguistically guided fashion, and gives insight into remaining issues.