Similarity Visualization for the Grouping of Forensic Speech Recordings

  • Authors:
  • Klara A. Weiand;Jos S. Bouten;Cor J. Veenman

  • Affiliations:
  • Intelligent Systems Lab, University of Amsterdam, Amsterdam, The Netherlands;Digital Technology & Biometrics Department, Netherlands Forensic Institute, The Hague, The Netherlands;Intelligent Systems Lab, University of Amsterdam, Amsterdam, The Netherlands and Digital Technology & Biometrics Department, Netherlands Forensic Institute, The Hague, The Netherlands

  • Venue:
  • IWCF '08 Proceedings of the 2nd International Workshop on Computational Forensics
  • Year:
  • 2008

Abstract

In a forensic phone wiretapping investigation, a major problem is to get the full picture of the speakers involved. Typically, the wiretapped speech recordings are grouped using a clustering tool. The main disadvantage of such an approach is that grouping errors accumulate in a bootstrapped scenario. In this paper, we propose a visual approach to finding similar speech recordings that probably stem from the same speaker. We first model the speech recordings and define suitable similarity measures between recordings. Then, through an approximate 2-D visualization of the inter-speech similarities, the investigator can identify clear groups of recordings as well as recordings that are harder to differentiate. We conducted extensive experiments on phone data of 50 speakers with 2 recordings per speaker. We tested the quality of the 2-D visualization in relation to the original high-dimensional similarities. It turned out that, for the original high-dimensional similarity measure, the nearest recording is almost always the one from the same speaker. In the 2-D visualization, on average over all speech recordings, a recording of the same speaker is among the 10 nearest recordings.
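The evaluation described in the abstract (how highly a same-speaker recording ranks among a recording's nearest neighbours, before and after projection to 2-D) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the speech model, the similarity measure, or the projection method, so the sketch substitutes synthetic feature vectors, Euclidean distance, and classical multidimensional scaling (MDS). The function names are hypothetical, and the numbers it prints say nothing about the paper's results.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix between the rows of X (assumed measure)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def classical_mds(D, dim=2):
    """Classical (Torgerson) MDS: embed distance matrix D in `dim` dimensions.

    Assumed stand-in for the paper's unspecified 2-D visualization method.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)       # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:dim]      # indices of the `dim` largest
    scale = np.sqrt(np.clip(eigvals[idx], 0.0, None))
    return eigvecs[:, idx] * scale

def same_speaker_ranks(D, speakers):
    """Rank (1 = nearest) of the closest same-speaker recording, per recording.

    Assumes every speaker has at least two recordings.
    """
    n = D.shape[0]
    ranks = np.empty(n, dtype=int)
    for i in range(n):
        others = np.array([j for j in range(n) if j != i])
        order = others[np.argsort(D[i, others], kind="stable")]
        same = speakers[order] == speakers[i]
        ranks[i] = int(np.argmax(same)) + 1    # first same-speaker position
    return ranks

# Synthetic stand-in data: 50 speakers, 2 recordings each, 20-D features.
rng = np.random.default_rng(0)
n_speakers, n_features = 50, 20
centers = rng.normal(size=(n_speakers, n_features))
speakers = np.repeat(np.arange(n_speakers), 2)
X = centers[speakers] + 0.1 * rng.normal(size=(2 * n_speakers, n_features))

D_high = pairwise_distances(X)                 # "original" dissimilarities
Y = classical_mds(D_high, dim=2)               # 2-D coordinates for plotting
D_2d = pairwise_distances(Y)

ranks_high = same_speaker_ranks(D_high, speakers)
ranks_2d = same_speaker_ranks(D_2d, speakers)
print("mean same-speaker rank, original space:", ranks_high.mean())
print("mean same-speaker rank, 2-D embedding:", ranks_2d.mean())
```

Comparing the two mean ranks gives a rough sense of how much neighbourhood structure is lost in the projection, which is the kind of comparison the abstract reports between the high-dimensional similarities and the 2-D visualization.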