Investigating cross-language speech retrieval for a spontaneous conversational speech collection

  • Authors: Diana Inkpen; Muath Alzghool; Gareth J. F. Jones; Douglas W. Oard

  • Affiliations: University of Ottawa, Ottawa, Ontario, Canada; University of Ottawa, Ottawa, Ontario, Canada; Dublin City University, Dublin, Ireland; University of Maryland, College Park, MD

  • Venue: NAACL-Short '06: Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers
  • Year: 2006

Abstract

Cross-language retrieval of spontaneous speech combines the challenges of working with noisy automatic transcription and language translation. The CLEF 2005 Cross-Language Speech Retrieval (CL-SR) task provides a standard test collection for investigating these challenges. We show that we can improve retrieval performance: by careful selection of the term weighting scheme; by decomposing automatic transcripts into phonetic substrings to help ameliorate transcription errors; and by combining automatic transcriptions with manually assigned metadata. We further show that topic translation with online machine translation resources yields effective CL-SR.
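
The idea of decomposing transcripts into phonetic substrings can be illustrated with a minimal sketch. The abstract does not state the phonetic representation or the substring length, so everything below is an assumption made for illustration: the phone symbols, the window size `n=4`, and the helper name `phone_ngrams` are all hypothetical. The point is only that when one phone is misrecognized, most overlapping substrings still match, whereas a whole-word match would fail.

```python
def phone_ngrams(phones, n=4):
    """Return overlapping n-grams over a sequence of phone symbols.

    Hypothetical helper: the symbol set and n=4 are assumptions,
    not values taken from the paper.
    """
    if not phones:
        return []
    if len(phones) < n:
        # Sequence shorter than the window: keep it as a single unit.
        return ["_".join(phones)]
    return ["_".join(phones[i:i + n]) for i in range(len(phones) - n + 1)]


# Made-up phone sequences for the same word; the second has its
# final phone misrecognized.
reference = ["r", "ih", "t", "r", "iy", "v", "ah", "l"]
recognized = ["r", "ih", "t", "r", "iy", "v", "ah", "n"]

# Substrings that survive the recognition error.
shared = set(phone_ngrams(reference)) & set(phone_ngrams(recognized))
```

Here four of the five substrings of each sequence are shared, so a retrieval system indexing substrings instead of whole words could still match the query term despite the error.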