Spoken term detection system based on combination of LVCSR and phonetic search

Authors:
Igor Szöke; Michal Fapšo; Martin Karafiát; Lukáš Burget; František Grézl; Petr Schwarz; Ondřej Glembek; Pavel Matějka; Jiří Kopecký; Jan "Honza" Černocký
Affiliations:
Speech@FIT, Faculty of Information Technology, Brno University of Technology (all authors)
Venue:
MLMI'07: Proceedings of the 4th International Conference on Machine Learning for Multimodal Interaction
Year:
2007

Abstract

The paper presents the Brno University of Technology (BUT) system for indexing and searching speech, which combines large-vocabulary continuous speech recognition (LVCSR) with a phonetic approach. It gives a complete description of the system's individual building blocks, from signal processing through the recognizers, indexing, and search to the normalization of detection scores. It also describes the data used in the first edition of the NIST Spoken Term Detection (STD) evaluation. Results are presented for three US English conditions (meetings, broadcast news, and conversational telephone speech) in terms of detection error trade-off (DET) curves and the term-weighted value (TWV) metric defined by NIST.
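
For readers unfamiliar with the TWV metric named in the abstract, the following is a minimal Python sketch of how a term-weighted value at one fixed decision threshold can be computed from per-term detection counts. It is not taken from the paper. The names TermCounts and term_weighted_value are hypothetical, and both the weight beta = 999.9 and the approximation of non-target trials as seconds of speech minus true occurrences are assumptions based on the NIST STD 2006 evaluation plan as it is commonly described.

```python
from dataclasses import dataclass

# Assumed NIST STD 2006 weight: cost/value ratio 0.1 and prior term
# probability 1e-4 give beta = 0.1 * (1 / 1e-4 - 1) = 999.9.
BETA = 999.9


@dataclass
class TermCounts:
    """Detection counts for one search term (hypothetical helper type)."""
    n_true: int         # reference occurrences of the term
    n_correct: int      # detections that match a reference occurrence
    n_false_alarm: int  # detections that match no reference occurrence


def term_weighted_value(counts, speech_seconds, beta=BETA):
    """Return 1 minus the average over terms of P_miss + beta * P_fa.

    Terms with no reference occurrence are skipped, because their miss
    probability is undefined.
    """
    scored = [c for c in counts if c.n_true > 0]
    if not scored:
        raise ValueError("no term has a reference occurrence")
    penalty = 0.0
    for c in scored:
        p_miss = 1.0 - c.n_correct / c.n_true
        # Assumption: one non-target trial per second of speech.
        p_fa = c.n_false_alarm / (speech_seconds - c.n_true)
        penalty += p_miss + beta * p_fa
    return 1.0 - penalty / len(scored)
```

Under these assumptions a system that finds every occurrence with no false alarms scores 1.0, and a system that returns nothing scores 0.0. Because beta is large, even a few false alarms per term lower the score noticeably.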