Combining labeled and unlabeled data with co-training
COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
Message Understanding Conference-6: a brief history
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
Inducing multilingual text analysis tools via robust projection across aligned corpora
HLT '01 Proceedings of the first international conference on Human language technology research
An unsupervised method for word sense tagging using parallel corpora
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Bootstrapping parsers via syntactic projection across parallel texts
Natural Language Engineering
Introduction to the CoNLL-2002 shared task: language-independent named entity recognition
COLING-02 proceedings of the 6th conference on Natural language learning - Volume 20
The multilingual entity task (MET) overview
TIPSTER '96 Proceedings of a workshop on held at Vienna, Virginia: May 6-8, 1996
Introduction to the CoNLL-2003 shared task: language-independent named entity recognition
CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
Grounding spatial named entities for information extraction and question answering
HLT-NAACL-GEOREF '03 Proceedings of the HLT-NAACL 2003 workshop on Analysis of geographic references - Volume 1
A high-performance semi-supervised learning method for text chunking
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Effective self-training for parsing
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Improving machine translation quality with automatic named entity recognition
EAMT '03 Proceedings of the 7th International EAMT workshop on MT and other Language Technology Tools, Improving MT through other Language Technology Tools: Resources and Tools for Building MT
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Phrase clustering for discriminative learning
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - Volume 2
Word representations: a simple and general method for semi-supervised learning
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Phylogenetic grammar induction
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Posterior Regularization for Structured Latent Variable Models
The Journal of Machine Learning Research
Frustratingly easy semi-supervised domain adaptation
DANLP 2010 Proceedings of the 2010 Workshop on Domain Adaptation for Natural Language Processing
Unsupervised part-of-speech tagging with bilingual graph-based projections
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Local and global algorithms for disambiguation to Wikipedia
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Unsupervised structure prediction with non-parallel multilingual guidance
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Multi-source transfer of delexicalized dependency parsers
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Training dependency parsers by jointly optimizing multiple objectives
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Cross-lingual word clusters for direct transfer of linguistic structure
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Hi-index | 0.00 |
In this paper, we study direct transfer methods for multilingual named entity recognition. Specifically, we extend the method recently proposed by Täckström et al. (2012), which is based on cross-lingual word cluster features. First, we show that by using multiple source languages, combined with self-training for target language adaptation, we can achieve significant improvements compared to using only single source direct transfer. Second, we investigate how the direct transfer system fares against a supervised target language system and conclude that between 8,000 and 16,000 word tokens need to be annotated in each target language to match the best direct transfer system. Finally, we show that we can significantly improve target language performance, even after annotating up to 64,000 tokens in the target language, by simply concatenating source and target language annotations.