This study investigates the use of statistical machine translation to create domain-specific language resources. We propose a methodology for building a domain-specific automatic speech recognition (ASR) system for a low-resource language when in-domain text corpora are available only in a high-resource language. Several translation scenarios, both unsupervised and semi-supervised, are used to obtain domain-specific textual data. Moreover, this paper shows that a small amount of manually post-edited text is enough to develop other natural language processing systems that, in turn, can be used to automatically improve the machine-translated text, leading to a significant boost in ASR performance. The paper closes with an in-depth analysis of why and how the machine-translated text improves the performance of the domain-specific ASR system. As by-products of this core domain-adaptation methodology, this paper also presents the first large-vocabulary continuous speech recognition system for Romanian and introduces a diacritics restoration module for processing the Romanian text corpora, as well as an automatic phonetization module needed to extend the Romanian pronunciation dictionary.