This paper presents a corrective model for speech recognition of inflected languages. The model, based on a discriminative framework, incorporates word n-gram features as well as factored morphological features, and reduces errors relative to a model based solely on word n-gram features. Experiments on a large-vocabulary task, namely the Czech portion of the MALACH corpus, demonstrate a reduction of about 1.1 to 1.5% absolute in word error rate, of which morphological features contribute about a third. A simple feature selection mechanism based on χ² statistics is shown to reduce the number of features by about 70% without any loss in performance, making it feasible to explore yet larger feature spaces.
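The abstract does not say how the χ² statistic is computed or which classes it contrasts, so the following is only a minimal sketch of one common formulation: a 2x2 contingency test between a binary feature and a binary label. The names `chi_square` and `select_features`, the labeling of hypotheses as positive or negative, and the `keep_fraction` parameter are all assumptions made for illustration, not the paper's actual procedure. Keeping 30% of the features corresponds to the roughly 70% reduction reported above.

```python
from typing import Dict, List, Set, Tuple


def chi_square(n11: int, n10: int, n01: int, n00: int) -> float:
    """Chi-square statistic for a 2x2 contingency table.

    n11: feature present, positive label
    n10: feature present, negative label
    n01: feature absent,  positive label
    n00: feature absent,  negative label
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:
        # A feature (or label) that never varies carries no information.
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def select_features(
    examples: List[Tuple[Set[str], bool]], keep_fraction: float = 0.3
) -> List[str]:
    """Rank features by chi-square score and keep the top fraction.

    `examples` pairs the set of features active in a hypothesis with a
    binary label (hypothetical: e.g. True for a low-error hypothesis).
    """
    n_pos = sum(1 for _, label in examples if label)
    n_neg = len(examples) - n_pos

    # Count how often each feature occurs with each label.
    present: Dict[str, List[int]] = {}
    for feats, label in examples:
        for f in feats:
            counts = present.setdefault(f, [0, 0])  # [with pos, with neg]
            counts[0 if label else 1] += 1

    scores = {
        f: chi_square(c[0], c[1], n_pos - c[0], n_neg - c[1])
        for f, c in present.items()
    }
    # Sort by descending score; break ties by name for determinism.
    ranked = sorted(scores, key=lambda f: (-scores[f], f))
    k = max(1, int(len(ranked) * keep_fraction))
    return ranked[:k]


if __name__ == "__main__":
    toy = [
        ({"a", "b"}, True),
        ({"a", "b", "c"}, True),
        ({"b"}, False),
        ({"b", "c"}, False),
    ]
    print(select_features(toy, keep_fraction=0.34))
```

In this toy data, feature `a` separates the two labels perfectly, while `b` is always present and `c` is evenly split, so only `a` receives a nonzero score.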