Phrase-based query degradation modeling for vocabulary-independent ranked utterance retrieval

Authors:
J. Scott Olsson; Douglas W. Oard
Affiliations:
Johns Hopkins University, Baltimore, MD; University of Maryland, College Park, MD
Venue:
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Year:
2009

Abstract

This paper introduces a new approach to ranking speech utterances by a system's confidence that they contain a spoken word. Multiple alternate pronunciations, or degradations, of a query word's phoneme sequence are hypothesized and incorporated into the ranking function. We consider two methods for hypothesizing these degradations, the best of which is constructed using factored phrase-based statistical machine translation. We show that this approach is able to significantly improve upon a state-of-the-art baseline technique in an evaluation on held-out speech. We evaluate our systems using three different methods for indexing the speech utterances (using phoneme, phoneme multigram, and word recognition), and find that degradation modeling shows particular promise for locating out-of-vocabulary words when the underlying indexing system is constructed with standard word-based speech recognition.
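
The general idea of folding hypothesized degradations into a ranking function can be illustrated with a toy sketch. This is not the authors' ranking function or their degradation model: the phoneme symbols, the degradation probabilities, the utterance data, and all function names below are hypothetical, and the score is assumed to be a simple probability-weighted count of exact matches in a phoneme-level index.

```python
from typing import Dict, List, Sequence, Tuple

Phonemes = Tuple[str, ...]


def count_occurrences(needle: Phonemes, haystack: Sequence[str]) -> int:
    """Count (possibly overlapping) exact occurrences of needle in haystack."""
    n = len(needle)
    if n == 0:
        return 0
    return sum(
        1
        for i in range(len(haystack) - n + 1)
        if tuple(haystack[i:i + n]) == needle
    )


def score_utterance(
    degradations: Dict[Phonemes, float], utterance: Sequence[str]
) -> float:
    """Sum the match counts of each degradation, weighted by its probability."""
    return sum(
        prob * count_occurrences(deg, utterance)
        for deg, prob in degradations.items()
    )


def rank_utterances(
    degradations: Dict[Phonemes, float], utterances: Dict[str, Sequence[str]]
) -> List[Tuple[str, float]]:
    """Return (utterance id, score) pairs, highest score first."""
    scored = [
        (uid, score_utterance(degradations, phones))
        for uid, phones in utterances.items()
    ]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical degradations of one query word, with made-up probabilities.
DEGRADATIONS: Dict[Phonemes, float] = {
    ("t", "ah", "m", "ey", "t", "ow"): 0.6,  # assumed canonical pronunciation
    ("t", "ah", "m", "aa", "t", "ow"): 0.3,  # assumed alternate pronunciation
    ("d", "ah", "m", "ey", "d", "ow"): 0.1,  # assumed recognizer confusion
}

# Hypothetical phoneme-level index of three utterances.
UTTERANCES: Dict[str, Sequence[str]] = {
    "utt1": "ay l ay k t ah m aa t ow s".split(),
    "utt2": "dh ah k ae t s ae t".split(),
    "utt3": "t ah m ey t ow s uw p".split(),
}

ranking = rank_utterances(DEGRADATIONS, UTTERANCES)
for uid, score in ranking:
    print(uid, round(score, 3))
```

In this sketch an utterance that matches only a degraded form of the query still receives a nonzero score, so it ranks above utterances with no match at all. A real system would draw the degradations and their weights from a trained model rather than a hand-written table.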