Audio hot spotting and retrieval using multiple features

Authors:
Qian Hu;Fred Goodman;Stanley Boykin;Randy Fish;Warren Greiff
Affiliations:
MITRE Corporation;MITRE Corporation;MITRE Corporation;MITRE Corporation;MITRE Corporation
Venue:
SpeechIR '04 Proceedings of the Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval at HLT-NAACL 2004
Year:
2004

Citing 3
Cited 0

Speaker identification and verification using Gaussian mixture speaker models
Speech Communication

Integrated technologies for indexing spoken language
Communications of the ACM

Towards robustness to fast speech in ASR
ICASSP '96: Proceedings of the 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing, Volume 1

Abstract

This paper reports our ongoing efforts to exploit multiple features derived from an audio stream, using source material such as broadcast news, teleconferences, and meetings. These features are produced by algorithms including automatic speech recognition, automatic speech indexing, speaker identification, and prosodic and audio feature extraction. We describe our research prototype, the Audio Hot Spotting System, which allows users to query and retrieve data from multimedia sources using these multiple features. The system aims to accurately find segments of user interest, i.e., audio hot spots, within seconds of the actual event. In addition to spoken keywords, the system retrieves audio hot spots by speaker identity, words spoken by a specific speaker, changes in speech rate, and other non-lexical features, including applause and laughter. Finally, we discuss our approach to semantic, morphological, and phonetic query expansion to improve audio retrieval performance and to access cross-lingual data.