Robust voice activity detection for social sensing

  • Authors: Sebastian Feese; Gerhard Tröster
  • Affiliations: ETH Zurich, Zurich, Switzerland (both authors)
  • Venue: Proceedings of the 2013 ACM conference on Pervasive and ubiquitous computing adjunct publication
  • Year: 2013

Abstract

The speech modality is a rich source of personal information. As such, speech detection is a fundamental function of many social sensing applications. Even the amount of speech present in our surroundings can indicate our sociability and communication patterns. In this work, we present and evaluate a speech detection approach that uses dictionary learning and sparse signal representation. By transforming the noisy audio data into a sparse representation with a dictionary learned from clean speech data, we show that speech and non-speech can be discriminated even in low signal-to-noise conditions with up to 92% accuracy. In addition to an evaluation with simulated data, we evaluate the algorithm on a real-world data set recorded during firefighting missions. We show that the speech activity of firefighters can be detected with 85% accuracy when using a smartphone placed in the firefighting jacket.
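The general idea behind sparse-representation speech detection can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the features, the dictionary-learning algorithm, or the classifier. The sketch assumes a fixed toy dictionary of cosine atoms standing in for atoms learned from clean speech, uses plain matching pursuit as the sparse coder, and classifies a frame by the fraction of its energy that a few atoms can explain. All names (`explained_energy`, `is_speech`, the threshold of 0.5, the frame length of 64) are hypothetical choices made for illustration.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def matching_pursuit_residual(frame, dictionary, n_atoms):
    """Greedily subtract the best-matching unit-norm atom n_atoms times."""
    residual = list(frame)
    for _ in range(n_atoms):
        corr = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda i: abs(corr[i]))
        residual = [r - corr[best] * a
                    for r, a in zip(residual, dictionary[best])]
    return residual


def explained_energy(frame, dictionary, n_atoms=2):
    """Fraction of frame energy captured by an n_atoms-sparse approximation."""
    energy = dot(frame, frame)
    if energy == 0.0:
        return 0.0
    residual = matching_pursuit_residual(frame, dictionary, n_atoms)
    return 1.0 - dot(residual, residual) / energy


def is_speech(frame, dictionary, threshold=0.5, n_atoms=2):
    """Assumed decision rule: speech if the dictionary explains the frame well."""
    return explained_energy(frame, dictionary, n_atoms) >= threshold


# Toy stand-in for a dictionary learned from clean speech (assumption):
# four unit-norm cosine atoms over a 64-sample frame.
N = 64
rng = random.Random(0)
D = [unit([math.cos(2 * math.pi * k * n / N) for n in range(N)])
     for k in (2, 3, 5, 7)]

# A "speech-like" frame built from two atoms plus weak noise,
# and a pure-noise frame that the dictionary represents poorly.
speech_frame = [1.0 * a + 0.6 * b + rng.gauss(0.0, 0.05)
                for a, b in zip(D[0], D[2])]
noise_frame = [rng.gauss(0.0, 1.0) for _ in range(N)]

print(is_speech(speech_frame, D), is_speech(noise_frame, D))
```

The sketch relies on the property the abstract describes: a dictionary fitted to clean speech represents speech frames compactly, while noise spreads its energy across directions the atoms do not cover. A practical system would learn the dictionary from data and would likely feed the sparse coefficients to a trained classifier rather than apply a single energy threshold.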