Genetic algorithm-based improvement of robot hearing capabilities in separating and recognizing simultaneous speech signals

Authors:
Shun’ichi Yamamoto;Kazuhiro Nakadai;Mikio Nakano;Hiroshi Tsujino;Jean-Marc Valin;Ryu Takeda;Kazunori Komatani;Tetsuya Ogata;Hiroshi G. Okuno
Affiliations:
Graduate School of Informatics, Kyoto University, Japan;Honda Research Institute Japan Co., Ltd., Japan;Honda Research Institute Japan Co., Ltd., Japan;Honda Research Institute Japan Co., Ltd., Japan;CSIRO ICT Centre, Australia;Graduate School of Informatics, Kyoto University, Japan;Graduate School of Informatics, Kyoto University, Japan;Graduate School of Informatics, Kyoto University, Japan;Graduate School of Informatics, Kyoto University, Japan
Venue:
IEA/AIE'06 Proceedings of the 19th International Conference on Advances in Applied Artificial Intelligence: Industrial, Engineering and Other Applications of Applied Intelligent Systems
Year:
2006

Abstract

A robot usually hears a mixture of sounds, in particular simultaneous speech signals, so it should be able to localize, separate, and recognize each speech signal. Because separated speech signals suffer from spectral distortion, conventional automatic speech recognition (ASR) may fail to recognize them. Yamamoto et al. proposed using Missing Feature Theory to mask corrupted features in ASR, and developed an automatic missing-feature-mask generation (AMG) system that uses information obtained from sound source separation (SSS). Our evaluations of the system's recognition performance indicate that it could be improved by optimizing many of its parameters. We used genetic algorithms to optimize these parameters. Each chromosome consists of a set of parameters for SSS and AMG and is evaluated by the recognition rate of the separated sounds. We obtained an optimized set of parameters for each distance (from 50 cm to 250 cm in 50 cm steps) and each direction (intervals of 30, 60, and 90 degrees) for two simultaneous speech signals. The average isolated word recognition rates ranged from 84.9% to 94.7%.
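
The optimization loop described in the abstract can be illustrated with a minimal sketch. The abstract does not list the SSS and AMG parameters, the genetic operators, or the population settings, so everything below is an assumption made for illustration: four hypothetical parameters bounded in [0, 1], tournament selection, uniform crossover, Gaussian mutation, and elitism. The function placeholder_recognition_rate stands in for the real fitness, which would require running separation, mask generation, and ASR on recorded speech.

```python
import random

# Hypothetical bounds for four parameters; the real SSS/AMG parameters
# are not given in the abstract.
BOUNDS = [(0.0, 1.0)] * 4


def placeholder_recognition_rate(params):
    """Stand-in fitness (assumption). The real fitness would be the word
    recognition rate obtained after separation, masking, and ASR."""
    target = [0.2, 0.8, 0.5, 0.3]  # arbitrary optimum, illustration only
    err = sum((p - t) ** 2 for p, t in zip(params, target))
    return 100.0 * max(0.0, 1.0 - err)


def random_chromosome(rng):
    # One chromosome = one complete set of parameter values.
    return [rng.uniform(lo, hi) for lo, hi in BOUNDS]


def tournament(pop, fits, rng, k=3):
    # Pick k distinct individuals at random and return the fittest.
    idxs = rng.sample(range(len(pop)), k)
    return pop[max(idxs, key=lambda i: fits[i])]


def crossover(a, b, rng):
    # Uniform crossover: each gene comes from either parent with p = 0.5.
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(chrom, rng, rate=0.2, sigma=0.1):
    # Gaussian perturbation of some genes, clipped to the parameter bounds.
    out = []
    for gene, (lo, hi) in zip(chrom, BOUNDS):
        if rng.random() < rate:
            gene = min(hi, max(lo, gene + rng.gauss(0.0, sigma)))
        out.append(gene)
    return out


def optimize(fitness, pop_size=30, generations=40, seed=0):
    """Return (best chromosome, its fitness, best fitness per generation)."""
    rng = random.Random(seed)
    pop = [random_chromosome(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        fits = [fitness(c) for c in pop]
        best_i = max(range(pop_size), key=lambda i: fits[i])
        history.append(fits[best_i])
        nxt = [pop[best_i][:]]  # elitism: carry the best individual over
        while len(nxt) < pop_size:
            child = crossover(tournament(pop, fits, rng),
                              tournament(pop, fits, rng), rng)
            nxt.append(mutate(child, rng))
        pop = nxt
    fits = [fitness(c) for c in pop]
    best_i = max(range(pop_size), key=lambda i: fits[i])
    history.append(fits[best_i])
    return pop[best_i], fits[best_i], history
```

Calling optimize(placeholder_recognition_rate) returns the best parameter set found, its score, and the best score per generation. To mirror the per-condition optimization mentioned in the abstract, one would run optimize once for each combination of distance and direction, each time with a fitness function that measures the recognition rate on data recorded under that condition.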