Domain adaptation of information extraction models

  • Authors: Rahul Gupta; Sunita Sarawagi
  • Affiliations: IIT Bombay; IIT Bombay
  • Venue: ACM SIGMOD Record
  • Year: 2009

Abstract

Domain adaptation is the process of adapting an extraction model trained in one domain to a related domain for which only unlabeled data is available. We present a brief survey of existing methods that retrain models to best exploit labeled data from a related domain. Because these approaches involve expensive model retraining, they are not practical when a large number of new domains must be handled in an operational setting. We describe our approach for adapting record extraction models, which exploits the regularity within a domain to jointly label records without retraining any model.