Unlimited vocabulary speech recognition for agglutinative languages
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
In this paper, we present a review of the latest developments in Russian speech recognition research. Although the underlying speech technology is mostly language-independent, differences between languages in structure and grammar have a substantial effect on recognition system performance. The Russian language has a complicated word formation system, characterized by a high degree of inflection and a flexible word order. These properties greatly reduce the predictive power of conventional language models and consequently increase the error rate. The current statistical approach to speech recognition requires large amounts of both speech and text data. Several Russian speech databases exist, and we describe them in this paper. In addition, we describe and compare several speech recognition systems developed in Russia as well as in some other countries. Finally, we suggest some promising directions for further research in Russian speech technology.