Although corpus-based approaches to machine translation (MT) are attracting growing interest, they are not applicable to less-resourced language pairs for which no parallel corpora are available; in those cases, the rule-based approach is the only applicable solution. Most rule-based MT systems use part-of-speech (PoS) taggers to resolve the PoS ambiguities in the source-language (SL) text to be translated, and they need accurate PoS taggers to produce reliable translations in the target language (TL). The standard statistical approach to PoS ambiguity resolution (or tagging) uses hidden Markov models (HMMs) trained either in a supervised way from hand-tagged corpora, an expensive resource that is not always available, or in an unsupervised way through the Baum-Welch expectation-maximization algorithm; both methods use information only from the language being tagged. However, when tagging is an intermediate task in the translation procedure, that is, when the PoS tagger is to be embedded as a module within an MT system, information from the TL can be used in the training phase, without supervision, to increase the translation quality of the whole MT system. This paper presents a method to train HMM-based PoS taggers for use in MT; the new method uses not only information from the SL, as general-purpose methods do, but also information from the TL and from the remaining modules of the MT system in which the PoS tagger is to be embedded. We find that the translation quality of the MT system embedding a PoS tagger trained in an unsupervised manner through this new method is clearly better than that of the same MT system embedding a PoS tagger trained through the Baum-Welch algorithm, and comparable to that obtained by embedding a PoS tagger trained in a supervised way from hand-tagged corpora.
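The abstract does not spell out how TL information enters HMM training. The Python sketch below illustrates one plausible reading and should not be taken as the authors' implementation. It assumes that every possible tag sequence of an SL segment is translated by the rest of the MT system, that each translation is scored by a TL model, and that the normalized scores serve as fractional counts for estimating tag-transition probabilities. The names `tl_driven_counts`, `normalize`, `toy_translate` and `toy_tl_score`, as well as the toy scores, are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from itertools import product


def tl_driven_counts(segments, translate, tl_score):
    """Accumulate fractional tag-bigram counts weighted by TL scores.

    segments:  list of segments; a segment is a list of (word, candidate_tags).
    translate: hypothetical stand-in for the remaining MT modules; maps
               (words, tag_path) to a TL string.
    tl_score:  hypothetical stand-in for a TL model; returns a score >= 0.
    """
    counts = defaultdict(float)
    for segment in segments:
        words = [word for word, _ in segment]
        # Enumerate every disambiguation path (one tag per word).
        paths = list(product(*(tags for _, tags in segment)))
        scores = [tl_score(translate(words, path)) for path in paths]
        total = sum(scores)
        if total == 0:
            continue  # no path has a usable translation; skip the segment
        for path, score in zip(paths, scores):
            weight = score / total  # share of the evidence given to this path
            for prev_tag, cur_tag in zip(path, path[1:]):
                counts[(prev_tag, cur_tag)] += weight
    return counts


def normalize(counts):
    """Turn fractional bigram counts into transition probabilities."""
    totals = defaultdict(float)
    for (prev_tag, _), value in counts.items():
        totals[prev_tag] += value
    return {
        (prev_tag, cur_tag): value / totals[prev_tag]
        for (prev_tag, cur_tag), value in counts.items()
    }


# Toy stand-ins (assumptions for illustration only).
def toy_translate(words, path):
    return " ".join(f"{word}/{tag}" for word, tag in zip(words, path))


TOY_TL_SCORES = {"the/DET book/NOUN": 0.9, "the/DET book/VERB": 0.1}


def toy_tl_score(text):
    return TOY_TL_SCORES.get(text, 0.0)


segments = [[("the", ["DET"]), ("book", ["NOUN", "VERB"])]]
counts = tl_driven_counts(segments, toy_translate, toy_tl_score)
probs = normalize(counts)
print(probs)
```

In this toy setup the TL model prefers the translation produced by the DET NOUN reading, so the estimated DET to NOUN transition receives most of the probability mass even though no hand-tagged SL data was used. Enumerating all paths grows exponentially with segment length, so a practical system would need short segments or some form of pruning.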