A unified neural-network-based speaker localization technique

  • Authors:
  • G. Arslan;F. A. Sakarya

  • Affiliations:
  • Dept. of Electr. & Comput. Eng., Texas Univ., Austin, TX

  • Venue:
  • IEEE Transactions on Neural Networks
  • Year:
  • 2000

Abstract

Locating and tracking a speaker in real time using microphone arrays is important in many applications such as hands-free video conferencing, speech processing in large rooms, and acoustic echo cancellation. A speaker may move from the far field to the near field of the array, or vice versa. Many neural-network-based localization techniques exist, but they are applicable to either far-field or near-field sources, not both, and they are computationally intensive for real-time speaker localization because of the wide-band nature of speech. We propose a unified neural-network-based source localization technique that is simultaneously applicable to wide-band and narrow-band signal sources in either the far field or the near field of a microphone array. The technique uses a multilayer perceptron feedforward neural network and forms the feature vectors by computing the normalized instantaneous cross-power spectrum samples between adjacent pairs of sensors. Simulation results indicate that our technique is able to locate a source with an absolute error of less than 3.5° at a signal-to-noise ratio of 20 dB and a sampling rate of 8000 Hz at each sensor.
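
The feature construction mentioned in the abstract can be illustrated with a short sketch. This is not the authors' implementation. It is one plausible reading, assuming that "normalized" means each cross-power sample is scaled to unit magnitude so that only the inter-sensor phase remains, and that the real and imaginary parts are stacked to form a real-valued input for the multilayer perceptron. The function name `cross_power_features`, the choice of frequency bins, the frame length, and the four-sensor test signal are all hypothetical.

```python
import numpy as np

def cross_power_features(frame, bins, eps=1e-12):
    """Build a real feature vector from one multichannel frame.

    frame: array of shape (n_sensors, n_samples), one row per microphone.
    bins:  indices of the FFT bins to keep (hypothetical selection).
    """
    # One-sided spectrum of the current frame at every sensor.
    spectra = np.fft.rfft(frame, axis=1)
    # Instantaneous cross-power spectrum between adjacent sensor pairs.
    cross = spectra[:-1, bins] * np.conj(spectra[1:, bins])
    # Assumed normalization: divide by the magnitude, keeping phase only.
    cross = cross / (np.abs(cross) + eps)
    # Stack real and imaginary parts into one real-valued vector.
    return np.concatenate([cross.real.ravel(), cross.imag.ravel()])

# Synthetic test frame: 4 sensors, 256 samples, each sensor delayed by
# one extra sample relative to the previous one.
n_samples = 256
rng = np.random.default_rng(0)
source = rng.standard_normal(n_samples + 8)
frame = np.stack([source[8 - m : 8 - m + n_samples] for m in range(4)])

bins = np.arange(8, 64, 8)           # 7 bins, chosen arbitrarily
features = cross_power_features(frame, bins)
print(features.shape)                # (42,) = 3 pairs * 7 bins * 2 parts
```

Under these assumptions the feature vector depends on the relative delays between sensors rather than on the source amplitude, which is the kind of information a network would need in order to map features to a source position.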