An intelligent Web agent that autonomously learns how to translate

Authors:
Marco Turchi;Tijl De Bie;Nello Cristianini
Affiliations:
European Commission - JRC IPSC, Via E. Fermi, 2749 I-21027 Ispra VA, Italy. E-mail: marco.turchi@jrc.ec.europa.eu;Intelligent Systems Laboratory, Merchant Venturers Building, University of Bristol, Woodland Road, Bristol, BS8 1UB, UK. E-mail: Tijl.DeBie@bristol.ac.uk, nello@support-vector.net;Intelligent Systems Laboratory, Merchant Venturers Building, University of Bristol, Woodland Road, Bristol, BS8 1UB, UK. E-mail: Tijl.DeBie@bristol.ac.uk, nello@support-vector.net
Venue:
Web Intelligence and Agent Systems
Year:
2012

Abstract

We describe the design of an autonomous agent that can teach itself how to translate from a foreign language, by first assembling its own training set and then using it to improve its vocabulary and language model. The key idea is that a Statistical Machine Translation package can be used for the Cross-Language Retrieval task of assembling a training set from a vast amount of available text (e.g. a large multilingual corpus, or the Web); the agent then trains on that data and repeats the process several times. The stability issues related to such a feedback loop are addressed by a mathematical model that connects the statistical and control-theoretic aspects of the system. We test the agent in a controlled environment and on real-world tasks, showing that it can indeed improve its translation performance autonomously and in a stable fashion when seeded with a very small initial training set. We also develop a multiprocessor version of the agent that accesses the Web directly through a Web search engine, taking advantage of the large amount of data available there. The modelling approach we develop for this agent is general, and we believe that it will be useful for an entire class of self-learning autonomous agents working on the Web.
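
The retrieve-then-retrain loop described in the abstract can be illustrated with a deliberately tiny sketch. This is not the authors' system: a word-for-word lexicon stands in for the Statistical Machine Translation package, Jaccard word overlap stands in for Cross-Language Retrieval, and every name below (`translate`, `retrieve_pairs`, `learn`, `self_train`, the toy Italian/English pools, the 0.5 threshold) is a hypothetical choice made only for illustration. The stability model mentioned in the abstract is not covered.

```python
def translate(sentence, lexicon):
    """Word-for-word stand-in for an SMT decoder; unknown words pass through."""
    return [lexicon.get(word, word) for word in sentence.split()]


def jaccard(a, b):
    """Word-overlap similarity between two token collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve_pairs(source_pool, target_pool, lexicon, threshold):
    """Toy cross-language retrieval: translate each source sentence with the
    current lexicon, then greedily match sentences one-to-one by overlap."""
    scored = []
    for i, src in enumerate(source_pool):
        hypothesis = translate(src, lexicon)
        for j, tgt in enumerate(target_pool):
            scored.append((jaccard(hypothesis, tgt.split()), i, j))
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))  # best scores first
    used_src, used_tgt, pairs = set(), set(), []
    for score, i, j in scored:
        if score < threshold:
            break
        if i in used_src or j in used_tgt:
            continue
        used_src.add(i)
        used_tgt.add(j)
        pairs.append((source_pool[i], target_pool[j]))
    return pairs


def learn(pairs, lexicon):
    """Toy 'training': if a pair leaves exactly one unknown source word and
    one unexplained target word, link the two."""
    updated = dict(lexicon)
    for src, tgt in pairs:
        src_words = src.split()
        unknown = [w for w in src_words if w not in updated]
        explained = {updated[w] for w in src_words if w in updated}
        leftover = [w for w in tgt.split() if w not in explained]
        if len(unknown) == 1 and len(leftover) == 1:
            updated[unknown[0]] = leftover[0]
    return updated


def self_train(seed_lexicon, source_pool, target_pool, rounds=3, threshold=0.5):
    """Feedback loop: retrieve a training set with the current model,
    retrain on it, and repeat until nothing new is learned."""
    lexicon = dict(seed_lexicon)
    for _ in range(rounds):
        pairs = retrieve_pairs(source_pool, target_pool, lexicon, threshold)
        updated = learn(pairs, lexicon)
        if updated == lexicon:
            break
        lexicon = updated
    return lexicon


# Hypothetical toy data: a two-word seed and unaligned sentence pools.
SEED = {"il": "the", "gatto": "cat"}
SOURCE_POOL = ["il gatto dorme", "il cane dorme", "il cane mangia"]
TARGET_POOL = ["the cat sleeps", "the dog sleeps", "the dog eats", "a bird sings"]

if __name__ == "__main__":
    print(self_train(SEED, SOURCE_POOL, TARGET_POOL))
```

In this toy setting each round unlocks one further sentence pair, so the lexicon grows from two to five entries over three rounds. The acceptance threshold is the knob that trades coverage against the risk of training on wrongly retrieved pairs, which is the kind of feedback effect a stability analysis would need to address.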