Investigating the Global Semantic Impact of Speech Recognition Error on Spoken Content Collections

Authors:
Martha Larson;Manos Tsagkias;Jiyin He;Maarten de Rijke
Affiliations:
Information and Communication Theory Group, EEMCS, Delft University of Technology, Delft, The Netherlands 2628 CD;ISLA, University of Amsterdam, Amsterdam, The Netherlands 1098 SJ;ISLA, University of Amsterdam, Amsterdam, The Netherlands 1098 SJ;ISLA, University of Amsterdam, Amsterdam, The Netherlands 1098 SJ
Venue:
ECIR '09 Proceedings of the 31st European Conference on IR Research on Advances in Information Retrieval
Year:
2009

Abstract

Errors in speech recognition transcripts reduce the effectiveness of content-based speech retrieval and pose a particular challenge for collections of conversational spoken content. We propose a Global Semantic Distortion (GSD) metric that measures the collection-wide impact of speech recognition error on spoken content retrieval in a query-independent manner. We use the metric to examine the effects of speech recognition substitution errors. First, we investigate frequent substitutions, cases in which the recognizer habitually mis-transcribes one word as another. Although habitual mistakes have a large global impact, the long tail of rare substitutions has a more damaging effect. Second, we investigate semantically similar substitutions, cases in which the word spoken and the word recognized do not diverge radically in meaning. Similar substitutions are shown to have slightly less global impact than semantically dissimilar substitutions.