An English-Hindi statistical machine translation system

Authors:
Raghavendra Udupa U.;Tanveer A. Faruquie
Affiliations:
IBM India Research Lab, New Delhi, India;IBM India Research Lab, New Delhi, India
Venue:
IJCNLP'04 Proceedings of the First international joint conference on Natural Language Processing
Year:
2004

Citing 9
Cited 1

Class-based n-gram models of natural language

Computational Linguistics
A maximum entropy approach to natural language processing

Computational Linguistics
The mathematics of statistical machine translation: parameter estimation

Computational Linguistics - Special issue on using large corpora: II
Decoding complexity in word-replacement translation models

Computational Linguistics
A DP based search using monotone alignments in statistical translation

ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Example-Based Machine Translation in the Pangloss system

COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
The Candide system for machine translation

HLT '94 Proceedings of the workshop on Human Language Technology
Automatic evaluation of machine translation quality using n-gram co-occurrence statistics

HLT '02 Proceedings of the second international conference on Human Language Technology Research
Fast sequential decoding algorithm using a stack

IBM Journal of Research and Development

An algorithmic framework for the decoding problem in statistical machine translation

COLING '04 Proceedings of the 20th international conference on Computational Linguistics


Abstract

Statistical methods for natural language translation have recently become popular and have met with reasonable success. In this paper, we describe an English-Hindi statistical machine translation system based on IBM Models 1, 2, and 3. We present experimental results on an English-Hindi parallel corpus of 150,000 sentence pairs. We also propose two new algorithms for transferring fertility parameters from Model 2 to Model 3. Both have a worst-case time complexity of O(m^3), improving on the exponential-time algorithm proposed in the classical paper on IBM Models. When the maximum fertility of a word is small, our algorithms run in O(m^2) time and are therefore very efficient in practice.
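
The abstract does not spell out the two transfer algorithms, so the following is only an illustrative sketch of one standard way to obtain fertility probabilities in polynomial time, not the authors' method. It assumes that each target position links to a given source word independently with a known posterior probability, as is the case for alignment posteriors under Models 1 and 2. The function name `fertility_distribution`, its parameters, and the reading of m as the number of target positions are all assumptions made for this example.

```python
def fertility_distribution(link_probs, max_fertility=None):
    """Probability that exactly k target positions link to one source word.

    link_probs[j] is the assumed posterior probability that target position j
    is aligned to this source word; positions are treated as independent.
    Returns a list dist where dist[k] = P(fertility == k) for k = 0..cap.
    """
    m = len(link_probs)
    cap = m if max_fertility is None else min(max_fertility, m)

    # dist[k] = probability that exactly k of the positions seen so far link here.
    dist = [1.0] + [0.0] * cap

    for p in link_probs:
        # Update from high k to low k so dist[k - 1] still holds the
        # value from before this position was processed.
        for k in range(cap, 0, -1):
            dist[k] = dist[k] * (1.0 - p) + dist[k - 1] * p
        dist[0] *= 1.0 - p

    return dist


if __name__ == "__main__":
    # Hypothetical posteriors for one source word over three target positions.
    print(fertility_distribution([0.9, 0.2, 0.1], max_fertility=2))
```

Under these assumptions, the inner loop costs O(m × cap) per source word. Repeating it for every source word in a sentence of comparable length gives cubic time when the cap equals m and quadratic time when the cap is a small constant. That is consistent with the bounds quoted in the abstract, although the paper's own algorithms may differ.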