Proactive learning for building machine translation systems for minority languages

Authors:
Vamshi Ambati;Jaime Carbonell
Affiliations:
Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA
Venue:
HLT '09 Proceedings of the NAACL HLT 2009 Workshop on Active Learning for Natural Language Processing
Year:
2009

Citing 9
Cited 0

Stochastic inversion transduction grammars and bilingual parsing of parallel corpora

Computational Linguistics
Experiments with a Hindi-to-English transfer-based MT system under a miserly data scenario

ACM Transactions on Asian Language Information Processing (TALIP)
A syntax-based statistical translation model

ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Statistical phrase-based translation

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Sample Selection for Statistical Parsing

Computational Linguistics
Active learning for HPSG parse selection

CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
A hierarchical phrase-based model for statistical machine translation

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Proactive learning: cost-sensitive active learning with multiple imperfect oracles

Proceedings of the 17th ACM conference on Information and knowledge management
Cheap and fast---but is it good?: evaluating non-expert annotations for natural language tasks

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing

Quantified Score

Hi-index	0.00

Visualization

Abstract

Building machine translation (MT) for many minority languages in the world is a serious challenge. For many minor languages there is little machine readable text, few knowledgeable linguists, and little money available for MT development. For these reasons, it becomes very important for an MT system to make best use of its resources, both labeled and unlabeled, in building a quality system. In this paper we argue that traditional active learning setup may not be the right fit for seeking annotations required for building a Syntax Based MT system for minority languages. We posit that a relatively new variant of active learning, Proactive Learning, is more suitable for this task.