This paper presents a novel regression framework that models both the translational equivalence problem and the parameter estimation problem in statistical machine translation (SMT). The proposed method kernelizes the training process by formulating translation as a linear mapping between source and target word chunks (word n-grams of various lengths), which yields a regression problem with vector-valued outputs. A kernel ridge regression model and a one-class classifier, maximum margin regression, are compared, and the former is shown to perform better on this task. The experimental results demonstrate the framework's ability to handle very high-dimensional features implicitly and flexibly. However, it shares the common drawback of kernel methods: a lack of scalability. For real-world application, a more practical solution is proposed that approximates the regression hyperplane locally and linearly, using online selection of relevant training examples. In addition, we introduce a novel way to integrate language models into this machine translation framework: since a language model's n-gram representation exactly matches the definition of our feature space, the language model is incorporated as a penalty term in the objective function of the regression model.
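The core idea above can be illustrated with a minimal sketch: represent each source and target chunk as a blended n-gram count vector, then fit kernel ridge regression with vector-valued outputs, solving (K + λI)α = Y so that a new source chunk's predicted target feature vector is k(x)ᵀα. This is not the paper's implementation; the toy parallel corpus, the linear kernel, and the regularization value are illustrative assumptions.

```python
import numpy as np

def build_vocab(corpus, nmax=2):
    """Index every n-gram (n <= nmax) seen in the corpus."""
    vocab = {}
    for tokens in corpus:
        for n in range(1, nmax + 1):
            for i in range(len(tokens) - n + 1):
                g = tuple(tokens[i:i + n])
                if g not in vocab:
                    vocab[g] = len(vocab)
    return vocab

def ngram_vector(tokens, vocab, nmax=2):
    """Map a token sequence to its blended n-gram count vector."""
    v = np.zeros(len(vocab))
    for n in range(1, nmax + 1):
        for i in range(len(tokens) - n + 1):
            g = tuple(tokens[i:i + n])
            if g in vocab:
                v[vocab[g]] += 1.0
    return v

# Toy parallel corpus (hypothetical data, for illustration only)
src = [["das", "haus"], ["das", "auto"], ["ein", "haus"]]
tgt = [["the", "house"], ["the", "car"], ["a", "house"]]

sv = build_vocab(src)
tv = build_vocab(tgt)
X = np.array([ngram_vector(s, sv) for s in src])  # source features
Y = np.array([ngram_vector(t, tv) for t in tgt])  # target features (vector outputs)

# Kernel ridge regression with a linear kernel K = X X^T:
# dual coefficients alpha = (K + lam*I)^{-1} Y
lam = 0.1
K = X @ X.T
alpha = np.linalg.solve(K + lam * np.eye(len(src)), Y)

# Predict the target n-gram vector of an unseen source chunk: f(x) = k(x)^T alpha
x_new = ngram_vector(["ein", "auto"], sv)
y_pred = (X @ x_new) @ alpha
```

Decoding would then search for the target string whose n-gram vector best matches `y_pred`; because the language model is itself defined over the same n-gram space, it can be folded into that search objective as an additive penalty, as the abstract describes.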