Score distribution based term specific thresholding for spoken term detection
NAACL-Short '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers
Contextual information improves OOV detection in speech
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Novel methods for query selection and query combination in query-by-example spoken term detection
Proceedings of the 2010 international workshop on Searching spontaneous conversational speech
Direct posterior confidence for out-of-vocabulary spoken term detection
Proceedings of the 2010 international workshop on Searching spontaneous conversational speech
Learning sub-word units for open vocabulary speech recognition
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Direct posterior confidence for out-of-vocabulary spoken term detection
ACM Transactions on Information Systems (TOIS)
The spoken term detection (STD) task aims to return relevant segments from a spoken archive that contain the query terms, whether or not they are in the system vocabulary. This paper focuses on pronunciation modeling for out-of-vocabulary (OOV) terms, which frequently occur in STD queries. The STD system described in this paper indexes word-level and sub-word-level lattices or confusion networks produced by an LVCSR system using Weighted Finite State Transducers (WFST). We investigate the inclusion of n-best pronunciation variants for OOV terms (obtained from letter-to-sound rules) into the search and present the results obtained by indexing confusion networks as well as lattices. Two observations are worth mentioning: phone indexes generated from sub-words represent OOVs well, and too many variants for the OOV terms degrade performance if pronunciations are not weighted.
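The effect of weighting pronunciation variants can be illustrated with a minimal sketch. It is not the WFST-based system described above: plain Python sequence matching over a toy phone index stands in for lattice or confusion-network search, and the index contents, the phone strings, the variant weights, and the names `find_occurrences` and `search_oov` are all hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Tuple

Pron = Tuple[str, ...]


def find_occurrences(phones: List[str], pron: Pron) -> List[int]:
    """Return every start position where `pron` occurs contiguously in `phones`."""
    n = len(pron)
    return [i for i in range(len(phones) - n + 1) if tuple(phones[i:i + n]) == pron]


def search_oov(
    index: Dict[str, List[str]],
    variants: List[Tuple[Pron, float]],
    weighted: bool = True,
) -> List[Tuple[str, int, float]]:
    """Search all pronunciation variants of one OOV term in a toy phone index.

    Each hit is scored with the weight of the variant that produced it, or with
    1.0 for every variant when `weighted` is False. If several variants hit the
    same position, the best score is kept.
    """
    hits: Dict[Tuple[str, int], float] = {}
    for pron, weight in variants:
        score = weight if weighted else 1.0
        for utt, phones in index.items():
            for start in find_occurrences(phones, pron):
                key = (utt, start)
                hits[key] = max(hits.get(key, 0.0), score)
    # Highest score first; ties broken by utterance id and position.
    return sorted(
        ((u, s, sc) for (u, s), sc in hits.items()),
        key=lambda h: (-h[2], h[0], h[1]),
    )


# Toy phone index: one phone sequence per utterance (assumed data).
index = {
    "utt1": "dh ax k ae t s ae t".split(),
    "utt2": "k ao t ih n".split(),
}

# Hypothetical 3-best letter-to-sound output for one OOV query, with assumed weights.
variants = [
    (("k", "ae", "t"), 0.7),
    (("k", "ey", "t"), 0.2),
    (("k", "ao", "t"), 0.1),
]

weighted_hits = search_oov(index, variants, weighted=True)
unweighted_hits = search_oov(index, variants, weighted=False)

threshold = 0.5
kept_weighted = [h for h in weighted_hits if h[2] >= threshold]
kept_unweighted = [h for h in unweighted_hits if h[2] >= threshold]
```

In this toy setting the low-weight variant matches `utt2`. With weights, that hit falls below the threshold; without weights, it scores the same as the hit from the most likely pronunciation and cannot be separated from it. This is one plausible reading of why adding many unweighted variants can hurt detection performance.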