Exploring representation-learning approaches to domain adaptation

  • Authors:
  • Fei Huang; Alexander Yates

  • Affiliations:
  • Temple University, Philadelphia, PA (both authors)

  • Venue:
  • Proceedings of the 2010 Workshop on Domain Adaptation for Natural Language Processing (DANLP 2010)
  • Year:
  • 2010


Abstract

Most supervised language processing systems show a significant drop in performance when tested on text from a domain substantially different from that of the training data. Sequence labeling systems such as part-of-speech taggers are typically trained on newswire text, and their error rate can triple or worse when they are tested on, for example, biomedical data. We investigate techniques for building open-domain sequence labeling systems that approach the ideal of a system whose accuracy is high and constant across domains. In particular, we investigate unsupervised techniques for representation learning that provide new features which are stable across domains, in that they are predictive in both the training data and the out-of-domain test data. In experiments, our novel techniques reduce error by as much as 29% relative to the previous state of the art on out-of-domain text.
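To make the idea of domain-stable features concrete, here is a minimal sketch of deriving a distributional feature from unlabeled text spanning two domains. This is an illustrative toy, not the authors' actual method (which learns richer representations): each word is represented by its most frequent left neighbor, so words that fill similar syntactic slots share a feature value even when the words themselves never appear in the labeled training data.

```python
from collections import Counter, defaultdict

def left_context_feature(sentences):
    """For each word in the unlabeled corpus, return its most frequent
    left neighbor. This acts as a crude learned representation: it is
    computed without labels, and it groups words by syntactic context
    rather than by surface identity."""
    left = defaultdict(Counter)
    for sent in sentences:
        for prev, word in zip(sent, sent[1:]):
            left[word][prev] += 1
    return {w: counts.most_common(1)[0][0] for w, counts in left.items()}

# Toy unlabeled corpora: one newswire-like sentence, one biomedical-like.
unlabeled = [
    "the senator announced the bill".split(),
    "the protein binds the receptor".split(),
]
feats = left_context_feature(unlabeled)
# 'senator' (seen in training-domain text) and 'protein' (out-of-domain)
# receive the same feature value ('the'), so a tagger using this feature
# can transfer evidence about nouns across domains.
```

A tagger would use such feature values alongside standard lexical features; because the representation is estimated from unlabeled text of both domains, it remains predictive at test time even for out-of-domain vocabulary.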