Turkish Broadcast News Transcription and Retrieval

Authors:
E. Arisoy; D. Can; S. Parlak; H. Sak; M. Saraclar
Affiliations:
Dept. of Electr. & Electron. Eng., Bogazici Univ., Istanbul
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2009

Citing 0
Cited 7

Spoken information retrieval for Turkish broadcast news
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval

Spoken proper name retrieval for limited resource languages using multilingual hybrid representations
IEEE Transactions on Audio, Speech, and Language Processing

Improved recognition of spontaneous Hungarian speech: morphological and acoustic modeling techniques for a less resourced task
IEEE Transactions on Audio, Speech, and Language Processing

EMMA: a novel Evaluation Metric for Morphological Analysis
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics

Multi-lingual fingerspelling recognition for handicapped kiosk
Pattern Recognition and Image Analysis

Spoken Content Retrieval: A Survey of Techniques and Technologies
Foundations and Trends in Information Retrieval

Automatic recognition fingerspelling gestures in multiple languages for a communication interface for the disabled
Pattern Recognition and Image Analysis

Abstract

This paper summarizes our recent efforts to build a Turkish Broadcast News transcription and retrieval system. The agglutinative nature of Turkish leads to a high number of out-of-vocabulary (OOV) words, which in turn lowers automatic speech recognition (ASR) accuracy and compromises the performance of speech retrieval systems based on ASR output. A word-based ASR system is therefore not adequate for transcribing Turkish speech. To alleviate this problem, various sub-word-based recognition units are utilized. These units solve the OOV problem with moderate-size vocabularies and yield higher recognition accuracy than even a 500K-word vocabulary. As a novel approach, the interaction between the choice of recognition unit (words or sub-words) and discriminative training is explored. Sub-word models benefit from discriminative training more than word models do, especially in the discriminative language modeling framework. For speech retrieval, a spoken term detection system based on automata indexation is utilized. As with transcription, retrieval performance is measured under various schemes incorporating words and sub-words. The best results are obtained using a cascade of word and sub-word indexes together with term-specific thresholding.
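
The cascade of word and sub-word indexes mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract describes automata-based indexes, whereas plain dictionaries stand in for them here. The function `cascade_search`, the helper `toy_segment`, the toy index contents, the fallback rule (use the sub-word index only when the word index returns nothing), and the per-term threshold table are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# One detection: (utterance id, start time in seconds, posterior score in [0, 1]).
Hit = Tuple[str, float, float]


def cascade_search(
    term: str,
    word_index: Dict[str, List[Hit]],
    subword_index: Dict[Tuple[str, ...], List[Hit]],
    segment: Callable[[str], List[str]],
    thresholds: Dict[str, float],
    default_threshold: float = 0.5,
) -> List[Hit]:
    """Search the word index first, then fall back to the sub-word index.

    Assumption: dictionaries replace the automata-based indexes, and the
    threshold is a simple per-term lookup with a global default.
    """
    hits = word_index.get(term, [])
    if not hits:
        # The term was not found as a whole word (for example, it is OOV),
        # so look it up as a sequence of sub-word units instead.
        hits = subword_index.get(tuple(segment(term)), [])
    threshold = thresholds.get(term, default_threshold)
    # Keep only detections whose score reaches this term's threshold.
    return [hit for hit in hits if hit[2] >= threshold]


# Hypothetical segmentation table; a real system would use a learned or
# rule-based morphological segmenter.
TOY_SEGMENTS = {"evlerimizden": ["ev", "ler", "imiz", "den"]}


def toy_segment(term: str) -> List[str]:
    """Return the toy segmentation, or the whole term if none is known."""
    return TOY_SEGMENTS.get(term, [term])


# Toy indexes with made-up scores, for illustration only.
word_index = {"haber": [("utt1", 3.2, 0.9), ("utt4", 10.0, 0.3)]}
subword_index = {("ev", "ler", "imiz", "den"): [("utt2", 1.5, 0.6)]}
thresholds = {"haber": 0.5, "evlerimizden": 0.4}

in_vocab_hits = cascade_search(
    "haber", word_index, subword_index, toy_segment, thresholds
)
oov_hits = cascade_search(
    "evlerimizden", word_index, subword_index, toy_segment, thresholds
)
print(in_vocab_hits)
print(oov_hits)
```

In this sketch the in-vocabulary query is answered from the word index alone, while the inflected form that is missing from the word index is recovered through its sub-word sequence. How the thresholds are actually derived per term is not stated in the abstract and is left as a lookup table here.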