Contextual information improves OOV detection in speech

Authors:
Carolina Parada;Mark Dredze;Denis Filimonov;Frederick Jelinek
Affiliations:
Johns Hopkins University, Baltimore, MD;Johns Hopkins University, Baltimore, MD;University of Maryland, College Park, MD;Johns Hopkins University, Baltimore, MD
Venue:
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Year:
2010

Abstract

Out-of-vocabulary (OOV) words are an important source of error in large vocabulary continuous speech recognition (LVCSR) systems. These words cause recognition failures, which propagate through pipeline systems and degrade the performance of downstream applications. The detection of OOV regions in the output of an LVCSR system is typically addressed as a binary classification task, where each region is independently classified using local information. In this paper, we show that jointly predicting OOV regions and including contextual information from each region leads to substantial improvement in OOV detection. Compared to the state of the art, we reduce the missed OOV rate from 42.6% to 28.4% at a 10% false alarm rate.
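
The headline figure is a miss rate read off at a fixed false alarm rate. The Python sketch below illustrates how such an operating point could be computed from per-region detector scores, and what the reported change amounts to in relative terms. It is not the authors' evaluation code. The function names, the region-level scoring, and the threshold sweep are all assumptions made for illustration; only the 42.6%, 28.4%, and 10% figures come from the abstract.

```python
def miss_rate_at_false_alarm(scores, labels, max_fa_rate=0.10):
    """Lowest miss rate among thresholds whose false alarm rate is <= max_fa_rate.

    scores: hypothetical per-region OOV scores (higher means more likely OOV)
    labels: 1 if the region is truly OOV, 0 if it is in-vocabulary (IV)
    """
    n_oov = sum(labels)
    n_iv = len(labels) - n_oov
    if n_oov == 0 or n_iv == 0:
        raise ValueError("need at least one OOV and one IV region")

    # A threshold above every score flags nothing: no false alarms, all OOVs missed.
    best_miss = 1.0
    for threshold in sorted(set(scores)):
        flagged = [s >= threshold for s in scores]
        false_alarms = sum(f and y == 0 for f, y in zip(flagged, labels))
        hits = sum(f and y == 1 for f, y in zip(flagged, labels))
        if false_alarms / n_iv <= max_fa_rate:
            best_miss = min(best_miss, 1 - hits / n_oov)
    return best_miss


def relative_reduction(baseline, improved):
    """Fraction of the baseline error removed by the improved system."""
    return (baseline - improved) / baseline


# Toy data (made up for illustration): 10 IV regions and 4 OOV regions.
iv_scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.80]
oov_scores = [0.90, 0.70, 0.60, 0.33]
scores = iv_scores + oov_scores
labels = [0] * len(iv_scores) + [1] * len(oov_scores)

toy_miss = miss_rate_at_false_alarm(scores, labels, max_fa_rate=0.10)

# Figures from the abstract: 42.6% -> 28.4% missed OOV rate at 10% false alarms.
reported_gain = relative_reduction(42.6, 28.4)
```

On the toy data, one false alarm out of ten IV regions is allowed, and the best admissible threshold still misses one of the four OOV regions. Applied to the reported numbers, the drop of 14.2 percentage points corresponds to roughly a one-third relative reduction in missed OOVs.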