An interface agent for wrapper-based information extraction

  • Authors:
  • Jaeyoung Yang; Tae-Hyung Kim; Joongmin Choi

  • Affiliations:
  • Openbase Inc., Seoul, Korea; Department of Computer Science and Engineering, Hanyang University, Ansan, Kyunggi-Do, Korea; Department of Computer Science and Engineering, Hanyang University, Ansan, Kyunggi-Do, Korea

  • Venue:
  • PRIMA'04 Proceedings of the 7th Pacific Rim international conference on Intelligent Agents and Multi-Agent Systems
  • Year:
  • 2004

Abstract

This paper proposes a new method of building information extraction rules for Web documents, using a user interface agent that combines manual and automatic approaches to rule generation. We adopt a supervised learning scheme in which the interface agent obtains information from the user about what to extract from a document, and XML-based wrappers are generated from these inputs. The interface agent is used not only to generate new extraction rules but also to modify and extend existing ones, in order to improve the precision and recall of Web information extraction systems. We conducted a series of experiments to test the system, and the results are promising.
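
To make the workflow in the abstract concrete, the following is a minimal sketch in Python, not the authors' system. It assumes a deliberately simple rule model that the abstract does not specify: a rule is an XML element holding the text immediately to the left and right of a value the user has marked. The element names (`rule`, `left`, `right`), the function names, and the sample document are all hypothetical. The last function computes precision and recall in the standard way, as the fraction of extracted items that are correct and the fraction of expected items that were extracted.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def build_rule(document: str, labeled_value: str, field: str,
               context: int = 4) -> ET.Element:
    """Build a hypothetical XML rule from one value the user has marked.

    The rule stores up to `context` characters on each side of the first
    occurrence of the marked value. This delimiter model is an assumption.
    """
    start = document.index(labeled_value)  # raises ValueError if absent
    end = start + len(labeled_value)
    rule = ET.Element("rule", {"field": field})
    ET.SubElement(rule, "left").text = document[max(0, start - context):start]
    ET.SubElement(rule, "right").text = document[end:end + context]
    return rule


def apply_rule(rule: ET.Element, document: str) -> List[str]:
    """Return every substring enclosed by the rule's left/right delimiters."""
    left = rule.findtext("left")
    right = rule.findtext("right")
    if not left or not right:
        # Empty delimiters would match everywhere, so reject them.
        raise ValueError("rule needs non-empty left and right delimiters")
    results = []
    pos = 0
    while True:
        i = document.find(left, pos)
        if i == -1:
            break
        value_start = i + len(left)
        j = document.find(right, value_start)
        if j == -1:
            break
        results.append(document[value_start:j])
        pos = j + len(right)
    return results


def precision_recall(extracted: List[str],
                     expected: List[str]) -> Tuple[float, float]:
    """Compute precision and recall over distinct extracted values."""
    extracted_set, expected_set = set(extracted), set(expected)
    correct = len(extracted_set & expected_set)
    precision = correct / len(extracted_set) if extracted_set else 0.0
    recall = correct / len(expected_set) if expected_set else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical sample page; the user marks "Widget" as a product name.
    page = "<li><b>Widget</b> $10</li><li><b>Gadget</b> $25</li>"
    name_rule = build_rule(page, "Widget", "name")
    print(ET.tostring(name_rule, encoding="unicode"))
    print(apply_rule(name_rule, page))
```

In this sketch, modifying or extending an existing rule would amount to editing the `left` and `right` elements after the user marks further examples, and re-running `precision_recall` shows whether the change helped.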