Ego noise cancellation of a robot using missing feature masks

  • Authors:
  • Gökhan Ince;Kazuhiro Nakadai;Tobias Rodemann;Hiroshi Tsujino;Jun-Ichi Imura

  • Affiliations:
  • Honda Research Institute Japan Co., Ltd., Wako-shi, Japan 351-0188 and Dept. of Mechanical and Environmental Informatics, Tokyo Institute of Technology, O-okayama, Meguro-ku, Japan 152-8552;Honda Research Institute Japan Co., Ltd., Wako-shi, Japan 351-0188 and Dept. of Mechanical and Environmental Informatics, Tokyo Institute of Technology, O-okayama, Meguro-ku, Japan 152-8552;Honda Research Institute Europe GmbH, Offenbach, Germany 63073;Honda Research Institute Japan Co., Ltd., Wako-shi, Japan 351-0188;Dept. of Mechanical and Environmental Informatics, Tokyo Institute of Technology, O-okayama, Meguro-ku, Japan 152-8552

  • Venue:
  • Applied Intelligence
  • Year:
  • 2011

Abstract

We describe an architecture that gives a robot the capability to recognize speech by cancelling ego noise, even while the robot is moving. The system consists of three blocks: (1) a multi-channel noise reduction block, comprising consecutive stages of microphone-array-based sound localization, geometric source separation and post-filtering; (2) a single-channel noise reduction block utilizing template subtraction; and (3) an automatic speech recognition block. In this work, we specifically investigate a missing feature theory-based automatic speech recognition (MFT-ASR) approach in block (3). This approach uses spectro-temporal elements derived from blocks (1) and (2) to measure the reliability of the acoustic features, and generates masks that filter out unreliable acoustic features. We evaluate this system on a robot using word correct rates. Furthermore, we present a detailed analysis of recognition accuracy to determine optimal parameters. Implementation of the proposed MFT-ASR approach resulted in significantly higher recognition performance than single-channel or multi-channel noise reduction methods alone.
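
As a rough illustration of how template subtraction and a missing feature mask can fit together, the sketch below scores each spectral bin by the fraction of its energy that survives subtraction of a noise template, then thresholds that score into a binary mask. It is only a minimal sketch under stated assumptions and not the method of the paper: the function names, the reliability measure, the threshold of 0.5 and the example values are all hypothetical, since the abstract does not specify them.

```python
def subtract_template(noisy_power, noise_template, floor=0.0):
    """Subtract a noise template from a noisy power spectrum, bin by bin.

    Negative results are clipped to `floor` (assumed value, not from the paper).
    """
    return [max(x - n, floor) for x, n in zip(noisy_power, noise_template)]


def reliability(noisy_power, noise_template, eps=1e-12):
    """Fraction of each bin's energy that survives subtraction, in [0, 1].

    This ratio is an assumed reliability measure chosen for illustration.
    """
    cleaned = subtract_template(noisy_power, noise_template)
    return [c / (x + eps) for c, x in zip(cleaned, noisy_power)]


def hard_mask(noisy_power, noise_template, threshold=0.5):
    """Binary mask: 1 marks a bin as reliable, 0 as unreliable."""
    scores = reliability(noisy_power, noise_template)
    return [1 if r >= threshold else 0 for r in scores]


# Hypothetical single frame with four frequency bins (made-up numbers).
frame = [4.0, 1.0, 9.0, 0.5]
template = [1.0, 0.9, 2.0, 0.6]
mask = hard_mask(frame, template)
```

In a recognizer that supports missing feature masks, bins marked 0 would be down-weighted or ignored when scoring acoustic models; a soft mask could be obtained by passing the reliability scores through directly instead of thresholding them.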