Performance of phase transform for detecting sound sources with microphone arrays in reverberant and noisy environments

  • Authors:
  • Kevin D. Donohue; Jens Hannemann; Henry G. Dietz

  • Affiliations:
  • Center for Visualization and Virtual Environments, University of Kentucky, 1 Quality St. Ste. 800, Lexington, KY 40507-1464, USA (all three authors)

  • Venue:
  • Signal Processing
  • Year:
  • 2007

Quantified Score

Hi-index 0.09

Abstract

The performance of sound source location (SSL) algorithms with microphone arrays can be enhanced by processing signals before the delay-and-sum operation. The phase transform (PHAT) has been shown to improve SSL images, especially in reverberant environments. This paper introduces a modification, referred to as the PHAT-β transform, that varies the degree of spectral magnitude information used by the transform through a single parameter, β. Performance results are computed using a Monte Carlo simulation of an eight-element perimeter array, with a receiver operating characteristic (ROC) analysis for detecting single and multiple sound sources. A Fisher's criterion performance measure is also computed for target and noise peak separability and compared to the ROC results. Results show that the standard PHAT significantly improves detection performance for broadband signals, especially in high levels of reverberation noise, and to a lesser degree for noise from other coherent sources. For narrowband targets, the PHAT typically results in significant performance degradation; however, the PHAT-β can achieve performance improvements for both narrowband and broadband signals. Finally, the performance for real speech signal samples is examined and shown to exhibit properties similar to both the simulated broadband and narrowband cases, suggesting the use of β values between 0.5 and 0.7 for array applications with general signals.
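
The abstract does not give the formula for the transform, so the following is only a minimal sketch of one common way to apply a partial phase transform to a pair of microphone signals. It assumes the cross-spectrum is divided by its own magnitude raised to the power β, so that β = 0 keeps all magnitude information (plain cross-correlation) and β = 1 discards it (standard PHAT). The function name `gcc_phat_beta`, the default β of 0.6, and the small constant `eps` are illustrative choices, not taken from the paper, and the paper's array processing may differ in detail.

```python
import numpy as np

def gcc_phat_beta(x1, x2, beta=0.6, eps=1e-12):
    """Estimate the delay of x2 relative to x1, in samples.

    Assumed weighting (not quoted from the paper): the cross-spectrum
    is divided by its magnitude raised to the power beta.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)

    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(x1) + len(x2)
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)

    # Cross-spectrum; its phase carries the inter-microphone delay.
    cross = X2 * np.conj(X1)

    # Partial whitening: beta = 0 leaves the magnitude untouched,
    # beta = 1 normalizes every frequency bin to (nearly) unit magnitude.
    weighted = cross / (np.abs(cross) + eps) ** beta

    # Back to the lag domain.
    cc = np.fft.irfft(weighted, n=n)

    # Reorder so lags run from -max_lag to +max_lag.
    max_lag = min(len(x1), len(x2)) - 1
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))

    return int(np.argmax(cc)) - max_lag
```

Under these assumptions, calling the function with a β between 0.5 and 0.7 corresponds to the range the abstract suggests for general signals, while β = 1.0 reproduces a standard PHAT-weighted cross-correlation for the pair.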