Most supervised language processing systems suffer a significant drop in performance when tested on text from a domain that differs markedly from that of their training data. Sequence labeling systems such as part-of-speech taggers are typically trained on newswire text, and their error rate on, for example, biomedical data can triple or worse. We investigate techniques for building open-domain sequence labeling systems that approach the ideal of accuracy that is high and constant across domains. In particular, we investigate unsupervised representation learning techniques that provide new features which are stable across domains, in the sense that they are predictive in both the training data and the out-of-domain test data. In experiments, our techniques reduce error by as much as 29% relative to the previous state of the art on out-of-domain text.
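To make the idea of domain-stable features concrete, here is a minimal sketch of one standard way to realize it: learning dense distributional word representations from unlabeled text drawn from both domains (here via SVD of a word co-occurrence matrix), then feeding those vectors to a supervised tagger as additional features. This is an illustration under stated assumptions, not the paper's own method; the corpus variable `unlabeled_corpus` and all parameter values are hypothetical.

```python
# Sketch: unsupervised distributional representations as domain-stable
# features for sequence labeling. NOT the paper's exact technique.
from collections import Counter
import numpy as np

def cooccurrence_matrix(sentences, vocab_size=2000, window=1):
    """Build a word-by-context count matrix from unlabeled, tokenized text."""
    counts = Counter(w for s in sentences for w in s)
    vocab = [w for w, _ in counts.most_common(vocab_size)]
    index = {w: i for i, w in enumerate(vocab)}
    mat = np.zeros((len(vocab), len(vocab)))
    for s in sentences:
        for i, w in enumerate(s):
            if w not in index:
                continue
            lo, hi = max(0, i - window), min(len(s), i + window + 1)
            for j in range(lo, hi):
                if j != i and s[j] in index:
                    mat[index[w], index[s[j]]] += 1.0
    return vocab, index, mat

def learn_representations(mat, dim=50):
    """Low-rank SVD of the log-scaled co-occurrence matrix: each row
    becomes a dense distributional vector for one vocabulary word."""
    u, s, _ = np.linalg.svd(np.log1p(mat), full_matrices=False)
    return u[:, :dim] * s[:dim]  # scale each component by its singular value

# unlabeled_corpus: list of tokenized sentences pooled from the source
# (e.g., newswire) and target (e.g., biomedical) domains -- hypothetical.
# vocab, index, mat = cooccurrence_matrix(unlabeled_corpus)
# reps = learn_representations(mat)   # reps[index[w]] is the vector for w
```

Because these vectors are estimated from unlabeled text covering both domains, appending them to a tagger's usual lexical features gives the model evidence that remains predictive on out-of-domain input, which is the property the abstract describes.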