Voice search of structured media data

  • Authors: Young-In Song; Ye-Yi Wang; Yun-Cheng Ju; Mike Seltzer; Ivan Tashev; Alex Acero
  • Affiliations: Korea University, Korea; Microsoft Research, USA; Microsoft Research, USA; Microsoft Research, USA; Microsoft Research, USA; Microsoft Research, USA
  • Venue: ICASSP '09 Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
  • Year: 2009

Abstract

This paper addresses the problem of using unstructured queries to search a structured database in voice search applications. By incorporating the structural information in music metadata, the end-to-end search error has been reduced by 15% on text queries and up to 11% on spoken queries. Building on that, an HMM sequential rescoring model has reduced the error rate by 28% on text queries and up to 23% on spoken queries compared to the baseline system. Furthermore, a phonetic similarity model has been introduced to compensate for speech recognition errors, improving end-to-end search accuracy consistently across different levels of speech recognition accuracy.
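To make the rescoring idea concrete, below is a minimal sketch of an HMM that tags each word of an unstructured query with the metadata field it most likely belongs to; the score of the decoded field sequence could then rescore candidate database entries. This is one plausible reading of "HMM sequential rescoring", not the paper's actual model: the field inventory, transition probabilities, toy unigram counts, and smoothing are all illustrative assumptions.

```python
import math

# Hypothetical field inventory; the abstract does not list the actual fields.
FIELDS = ["artist", "title", "album"]

# Toy field-specific unigram counts standing in for real emission models
# that would be estimated from the music metadata.
FIELD_UNIGRAMS = {
    "artist": {"beatles": 5, "madonna": 4, "the": 2},
    "title":  {"yesterday": 5, "holiday": 4, "the": 3, "long": 2},
    "album":  {"revolver": 5, "holiday": 1, "the": 2},
}

# Assumed transition probabilities encoding sequential structure,
# e.g. users tend to say the artist before the title.
TRANS = {
    None:     {"artist": 0.6, "title": 0.3, "album": 0.1},
    "artist": {"artist": 0.5, "title": 0.4, "album": 0.1},
    "title":  {"artist": 0.1, "title": 0.7, "album": 0.2},
    "album":  {"artist": 0.1, "title": 0.2, "album": 0.7},
}

def emission_logp(field, word, alpha=0.5):
    """Add-alpha smoothed unigram emission log-probability."""
    counts = FIELD_UNIGRAMS[field]
    total = sum(counts.values())
    vocab = {w for c in FIELD_UNIGRAMS.values() for w in c}
    return math.log((counts.get(word, 0) + alpha) / (total + alpha * len(vocab)))

def viterbi_field_tagging(query_words):
    """Decode the most likely metadata-field label for each query word."""
    # delta[f]: best log-prob of any label path for the prefix ending in f.
    delta = {f: math.log(TRANS[None][f]) + emission_logp(f, query_words[0])
             for f in FIELDS}
    backptrs = []
    for word in query_words[1:]:
        new_delta, ptrs = {}, {}
        for f in FIELDS:
            prev = max(FIELDS, key=lambda p: delta[p] + math.log(TRANS[p][f]))
            new_delta[f] = (delta[prev] + math.log(TRANS[prev][f])
                            + emission_logp(f, word))
            ptrs[f] = prev
        delta = new_delta
        backptrs.append(ptrs)
    # Trace back the best field sequence; its score can rescore candidates.
    best = max(FIELDS, key=lambda f: delta[f])
    path = [best]
    for ptrs in reversed(backptrs):
        path.append(ptrs[path[-1]])
    path.reverse()
    return path, delta[best]

if __name__ == "__main__":
    query = "beatles yesterday".split()
    tags, score = viterbi_field_tagging(query)
    print(list(zip(query, tags)), round(score, 2))
```

Running the example tags "beatles" as an artist word and "yesterday" as a title word; a real system would decode each recognition hypothesis this way and combine the HMM path score with the recognizer's score when ranking database entries.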