Supervised learning algorithms for identifying comparable sentence pairs in a predominantly non-parallel corpus require resources both for computing feature functions and for training the classifier. In this paper we propose active learning techniques for building comparable data for low-resource languages. In particular, we propose strategies to elicit two kinds of annotations from comparable sentence pairs: class label assignment and parallel segment extraction. We also propose an active learning strategy for the two annotations together that performs significantly better than sampling for either annotation independently.
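
The abstract does not name the query strategies it uses. As a rough illustration of how pool-based active learning might choose sentence pairs for class-label annotation, the sketch below assumes plain uncertainty sampling: it queries the pairs whose predicted probability of being comparable is closest to 0.5. The function names and the probability values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def uncertainty(p: float) -> float:
    """Binary uncertainty score: 1.0 at p = 0.5, falling to 0.0 at p = 0 or p = 1."""
    return 1.0 - abs(2.0 * p - 1.0)


def select_queries(probs: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k most uncertain items in the unlabeled pool."""
    ranked = sorted(
        range(len(probs)),
        key=lambda i: uncertainty(probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical classifier outputs P(comparable) for five unlabeled sentence pairs.
pool_probs = [0.95, 0.52, 0.10, 0.45, 0.70]

# Ask the annotator to label the two pairs the classifier is least sure about.
queries = select_queries(pool_probs, k=2)
print(queries)  # [1, 3]
```

In a full loop, the selected pairs would be labeled, added to the training set, and the classifier retrained before the next round of selection. A strategy covering both annotation types would also need a rule for deciding which annotation to request for each selected pair, which this sketch does not attempt.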