Natural language systems trained on labeled data from one domain do not perform well on other domains. Most adaptation algorithms proposed in the literature train a new model for the new domain using unlabeled data. However, retraining large models or pipeline systems is time-consuming. Moreover, the domain of a new target sentence may not be known, and one may not have a significant amount of unlabeled data for every new domain. To pursue the goal of Open-Domain NLP (train once, test anywhere), we propose ADUT (ADaptation Using label-preserving Transformation), an approach that avoids the need for retraining and does not require knowledge of the new domain or any data from it. Our approach applies simple label-preserving transformations to the target text so that the transformed text is more similar to the training domain; it then applies the existing model to the transformed sentences and combines the predictions to produce the desired prediction on the target text. We instantiate ADUT for the case of Semantic Role Labeling (SRL) and show that it compares favorably with approaches that retrain their model on the target domain. Specifically, this "on the fly" adaptation approach yields a 13% error reduction for a single-parse system when adapting from newswire text to fiction.
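The abstract describes a transform-predict-combine loop: rewrite the target sentence into variants closer to the training domain, run the frozen model on each variant, and merge the resulting predictions. The sketch below is a minimal Python illustration of that loop only; the transformation functions, the span-keyed prediction type, and the majority-vote combination are hypothetical placeholders, not the transformations or combination strategy used in the paper, and it assumes each variant's predictions can be aligned back to the original sentence's spans.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Placeholder label-preserving transformations (illustrative only; the paper's
# actual transformations are not reproduced here). Each returns a rewritten
# sentence that is assumed to keep the same predicate-argument structure.
def lowercase_variant(sentence: str) -> str:
    return sentence.lower()

def strip_quotes_variant(sentence: str) -> str:
    return sentence.replace('"', "").replace("'", "")

# A prediction maps an argument span (start, end) to a role label, e.g. "A0".
# Assumption: spans from each variant have already been mapped back to the
# original sentence so they are directly comparable.
Prediction = Dict[Tuple[int, int], str]

def adut_predict(
    sentence: str,
    base_model: Callable[[str], Prediction],
    transformations: List[Callable[[str], str]],
) -> Prediction:
    """Run the frozen base model on the sentence and on each transformed
    variant, then combine the per-span role labels by majority vote."""
    variants = [sentence] + [transform(sentence) for transform in transformations]
    predictions = [base_model(variant) for variant in variants]

    combined: Prediction = {}
    spans = {span for prediction in predictions for span in prediction}
    for span in spans:
        votes = Counter(p[span] for p in predictions if span in p)
        combined[span] = votes.most_common(1)[0][0]
    return combined
```

Usage would pass an existing SRL system wrapped as `base_model` together with a list of transformations; no retraining is involved, since only the inputs and the combination of outputs change.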