This work concerns SisHiTra, a hybrid machine translation system from Spanish into Catalan. We focus on its word translation disambiguation module, which must choose the correct translation of each ambiguous input word according to its context. We propose statistical pattern recognition techniques for this task, specifically multinomial Naive Bayes text classifiers. Extensive empirical results with these classifiers are presented, in which the influence of window (context) size and parameter smoothing is carefully studied.
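To make the two studied factors concrete, the following is a minimal Python sketch of a multinomial Naive Bayes classifier that picks a translation from the words in a context window around an ambiguous word. It is not the SisHiTra implementation: the names `context_window` and `MultinomialNB`, the add-alpha (Laplace) smoothing scheme, the handling of unseen words, and the four toy training sentences for Spanish "hoja" (Catalan "full" for a sheet of paper, "fulla" for a leaf) are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def context_window(tokens, index, size):
    """Return up to `size` tokens on each side of tokens[index]."""
    left = tokens[max(0, index - size):index]
    right = tokens[index + 1:index + 1 + size]
    return left + right


class MultinomialNB:
    """Multinomial Naive Bayes with add-alpha smoothing (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # smoothing parameter
        self.class_counts = Counter()           # training samples per translation
        self.word_counts = defaultdict(Counter) # context-word counts per translation
        self.vocab = set()

    def fit(self, samples):
        for context, label in samples:
            self.class_counts[label] += 1
            self.word_counts[label].update(context)
            self.vocab.update(context)
        return self

    def log_scores(self, context):
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(n / total)         # log prior
            for word in context:
                if word in self.vocab:          # assumption: skip unseen words
                    score += math.log((counts[word] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, context):
        scores = self.log_scores(context)
        return max(scores, key=scores.get)


# Toy, invented training data: (Spanish sentence, Catalan translation of "hoja").
TRAIN = [
    ("la hoja verde de la planta cae", "fulla"),
    ("el viento mueve la hoja seca del bosque", "fulla"),
    ("escribe tu nombre en la hoja de papel", "full"),
    ("imprime la hoja del documento", "full"),
]


def disambiguate(sentence, target="hoja", window=2, alpha=1.0):
    """Train on TRAIN with the given window size, then classify `sentence`."""
    samples = []
    for text, label in TRAIN:
        tokens = text.split()
        samples.append((context_window(tokens, tokens.index(target), window), label))
    model = MultinomialNB(alpha=alpha).fit(samples)
    tokens = sentence.split()
    return model.predict(context_window(tokens, tokens.index(target), window))


print(disambiguate("una hoja de papel en blanco"))  # full
print(disambiguate("la hoja seca del bosque"))      # fulla
```

Varying the `window` and `alpha` arguments of `disambiguate` corresponds, in miniature, to the window-size and smoothing experiments the abstract describes; with so little data the outcome is only indicative.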