A system for information retrieval from large records of Czech spoken data

Authors:
Jan Nouza;Jindřich Žďánský;Petr Červa;Jan Kolorenč
Affiliations:
SpeechLab, Technical University of Liberec, Liberec 1, Czech Republic;SpeechLab, Technical University of Liberec, Liberec 1, Czech Republic;SpeechLab, Technical University of Liberec, Liberec 1, Czech Republic;SpeechLab, Technical University of Liberec, Liberec 1, Czech Republic
Venue:
TSD'06 Proceedings of the 9th international conference on Text, Speech and Dialogue
Year:
2006

Cited by:

Automatic Speech Recognition Used for Intelligibility Assessment of Text-to-Speech Systems
Verbal and Nonverbal Features of Human-Human and Human-Machine Interaction

Underdetermined Blind Source Separation Using Linear Separation System
Multimodal Signals: Cognitive and Algorithmic Issues

Adapting lexical and language models for transcription of highly spontaneous spoken Czech
TSD'10 Proceedings of the 13th international conference on Text, Speech and Dialogue

Challenges in speech processing of Slavic languages (case studies in speech recognition of Czech and Slovak)
COST'09 Proceedings of the Second international conference on Development of Multimodal Interfaces: active Listening and Synchrony

Spoken Content Retrieval: A Survey of Techniques and Technologies
Foundations and Trends in Information Retrieval

Abstract

In the paper we describe a complex multi-level system for automatic search in large records of Czech spoken data. It includes modules for audio signal segmentation, speaker identification and adaptation, speech recognition, and full-text search. The search can focus on both keywords and key speakers. The transcription accuracy is about 79 % (for broadcast programs), and the search accuracy is about 90 %. Due to its distributed platform, the system can operate in almost real time.
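
To make the final search stage concrete, the following is a minimal sketch of how a combined keyword and speaker query over time-stamped transcripts might look. It is not the authors' implementation: the `Segment` record, its field names, the `search` function, and the toy data are all assumptions introduced here for illustration, and the abstract does not specify how the real system stores or indexes its transcripts.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed audio segment (hypothetical layout, not from the paper)."""
    start_s: float   # segment start time in seconds
    end_s: float     # segment end time in seconds
    speaker: str     # label assumed to come from speaker identification
    text: str        # transcript assumed to come from speech recognition


def search(segments, keyword=None, speaker=None):
    """Return segments that match an optional keyword and an optional speaker."""
    hits = []
    for seg in segments:
        if speaker is not None and seg.speaker != speaker:
            continue
        # Exact, case-insensitive match on whitespace-separated tokens.
        if keyword is not None and keyword.lower() not in seg.text.lower().split():
            continue
        hits.append(seg)
    return hits


# Toy data standing in for the output of the earlier pipeline stages.
segments = [
    Segment(0.0, 4.2, "speaker_a", "dnes bude jasno"),
    Segment(4.2, 9.8, "speaker_b", "vláda dnes jednala o rozpočtu"),
    Segment(9.8, 15.0, "speaker_a", "rozpočet schválila vláda"),
]

by_keyword = search(segments, keyword="vláda")
by_both = search(segments, keyword="vláda", speaker="speaker_a")
```

Exact token matching is the simplest possible choice here. Because Czech is a highly inflected language, a practical search would likely need to match different word forms as well, which this sketch does not attempt.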