SpeakerSense: energy efficient unobtrusive speaker identification on mobile phones

  • Authors:
  • Hong Lu;A. J. Bernheim Brush;Bodhi Priyantha;Amy K. Karlson;Jie Liu

  • Affiliations:
  • Dept. of Computer Science, Dartmouth College, Hanover, NH and Microsoft Research, Redmond, WA;Microsoft Research, Redmond, WA;Microsoft Research, Redmond, WA;Microsoft Research, Redmond, WA;Microsoft Research, Redmond, WA

  • Venue:
  • Pervasive'11 Proceedings of the 9th international conference on Pervasive computing
  • Year:
  • 2011

Abstract

Automatically identifying the person you are talking with using continuous audio sensing has the potential to enable many pervasive computing applications, from memory assistance to annotating life-logging data. However, a number of challenges, including energy efficiency and training data acquisition, must be addressed before unobtrusive audio sensing is practical on mobile devices. We built SpeakerSense, a speaker identification prototype whose heterogeneous multi-processor hardware architecture splits computation between a low-power processor and the phone's application processor to enable continuous background sensing with minimal power requirements. Using SpeakerSense, we benchmarked several system parameters (sampling rate, GMM complexity, smoothing window size, and amount of training data needed) to identify thresholds that balance computation cost with performance. We also investigated channel compensation methods that make it feasible to acquire training data from phone calls, and an automatic segmentation method for training speaker models based on one-to-one conversations.
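
To illustrate the smoothing window parameter named in the abstract, here is a minimal sketch; it is not the authors' implementation. It assumes that per-frame speaker labels already exist (for example, from choosing the speaker whose GMM scores a frame highest) and that smoothing is a causal majority vote over the most recent frames. The function name `smooth_labels` and the tie-breaking rule are illustrative assumptions.

```python
from collections import Counter, deque


def smooth_labels(frame_labels, window_size):
    """Majority-vote smoothing over a sliding window of per-frame labels.

    Hypothetical helper: each output label is the most frequent label
    among the last `window_size` input labels (fewer at the start).
    """
    if window_size < 1:
        raise ValueError("window_size must be >= 1")

    window = deque(maxlen=window_size)  # keeps only the most recent labels
    smoothed = []
    for label in frame_labels:
        window.append(label)
        # On a tie, the label that appears earliest in the window wins.
        smoothed.append(Counter(window).most_common(1)[0][0])
    return smoothed
```

A larger window suppresses isolated misclassifications but delays the response to a genuine change of speaker, which is presumably the kind of trade-off the benchmarked thresholds address.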