Using Soundex codes for indexing names in ASR documents

  • Authors:
  • Hema Raghavan;James Allan

  • Affiliations:
  • -;-

  • Venue:
  • SpeechIR '04 Proceedings of the Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval at HLT-NAACL 2004
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper we highlight the problems that arise due to variations of spellings of names that occur in text, as a result of which links between two pieces of text where the same name is spelt differently may be missed. The problem is particularly pronounced in the case of ASR text. We propose the use of approximate string matching techniques to normalize names in order to overcome the problem. We show how we could achieve an improvement if we could tag names with reasonable accuracy in ASR.