IEEE Transactions on Audio, Speech, and Language Processing
To improve Mandarin large vocabulary continuous speech recognition (LVCSR), a unified framework is introduced to exploit multi-level linguistic knowledge. In this framework, each knowledge source is represented by a weighted finite-state transducer (WFST), and the transducers are combined into a so-called analyzer that integrates the multi-level knowledge sources. Because of the uniform transducer representation, any knowledge source can be integrated into the analyzer as long as it can be encoded as a WFST. Moreover, because the knowledge at each level is modeled independently and the combination is performed at the model level, the information inherent in each knowledge source can be thoroughly exploited. The effectiveness of the analyzer is first investigated through simulations, and an LVCSR system embedding the analyzer is then evaluated. Experimental results show that the unified framework significantly improves speech recognition performance, with a 9.9% relative reduction in character error rate on the HUB-4 test set, a widely used Mandarin speech recognition task.
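As a rough illustration of how knowledge sources encoded as WFSTs might be combined at the model level, the sketch below composes two toy transducers. It is not the authors' implementation. The abstract does not say which operation builds the analyzer, so composition is an assumption here, as are the epsilon-free restriction, the tropical semiring (weights add along a path), the tuple-based data layout, and the toy `lexicon` and `tagger` transducers with their labels and weights.

```python
from collections import defaultdict

# Hypothetical representation: a transducer is (start, finals, arcs),
# where each arc is (src, in_label, out_label, weight, dst).

def compose(a, b):
    """Epsilon-free composition; weights are added (tropical semiring)."""
    a_start, a_finals, a_arcs = a
    b_start, b_finals, b_arcs = b

    # Index outgoing arcs by source state for both transducers.
    a_out, b_out = defaultdict(list), defaultdict(list)
    for src, i, o, w, dst in a_arcs:
        a_out[src].append((i, o, w, dst))
    for src, i, o, w, dst in b_arcs:
        b_out[src].append((i, o, w, dst))

    start = (a_start, b_start)
    arcs, finals = [], set()
    stack, seen = [start], {start}
    while stack:
        state = stack.pop()
        qa, qb = state
        if qa in a_finals and qb in b_finals:
            finals.add(state)
        # Pair arcs whose middle labels agree: output of a == input of b.
        for i, mid_a, w1, na in a_out[qa]:
            for mid_b, o, w2, nb in b_out[qb]:
                if mid_a == mid_b:
                    nxt = (na, nb)
                    arcs.append((state, i, o, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return start, finals, arcs

# Toy knowledge source 1 (assumed): syllables -> characters.
lexicon = (0, {2}, [(0, "ni", "你", 0.5, 1), (1, "hao", "好", 1.0, 2)])
# Toy knowledge source 2 (assumed): characters -> part-of-speech tags.
tagger = (0, {2}, [(0, "你", "PRON", 0.25, 1),
                   (0, "他", "PRON", 0.125, 1),  # never matched by the lexicon
                   (1, "好", "ADJ", 0.5, 2)])

combined = compose(lexicon, tagger)
for arc in combined[2]:
    print(arc)
```

The composed machine maps syllables directly to tags, and each arc carries the sum of the weights from both levels. This is one way to read the claim that independently modeled knowledge sources can be merged once they share a transducer representation.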