Reconfigurable Web Wrapper Agents

  • Authors: Chia-Hui Chang; Harianto Siek; Jiann-Jyh Lu; Chun-Nan Hsu; Jen-Jie Chiou
  • Affiliations: National Central University, Taiwan; Institute of Information Science, Taiwan; Institute of Information Science, Taiwan; Institute of Information Science, Taiwan; Deepspot Intelligent Systems, Taiwan
  • Venue: IEEE Intelligent Systems
  • Year: 2003

Abstract

The DeepSpot Agent Toolbox exploits online Web data sources using reconfigurable Web wrapper agents. These agents are rapidly generated and executed on the basis of the XML-based Web Navigation Description Language (WNDL) and the extraction rule generator IEPAD (information extraction based on pattern discovery). A WNDL script describes how to locate, extract, and combine data. By executing different WNDL scripts, users can automate all types of Web browsing sessions. The authors also describe IEPAD, a data extractor based on pattern discovery techniques. IEPAD lets software agents automatically discover the rules for extracting the contents of a structurally formatted Web page, without requiring anyone to label a page to train a wrapper. With this programming-by-example authoring tool, users can generate a complete Web wrapper agent by browsing the target Web sites. Various applications demonstrate the approach's feasibility.
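
The idea of discovering extraction rules from repeated page structure, with no labeled examples, can be illustrated with a small sketch. This is not IEPAD's actual algorithm, which the abstract does not detail. It is a naive stand-in that counts repeated runs of tag tokens and treats the most frequent, longest run as the record pattern. The function names (`tokenize`, `discover_pattern`, `extract_records`) and the sample page are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical sample page: three records that share one tag structure.
SAMPLE_HTML = """
<ul>
  <li><b>Apple</b> 1.20</li>
  <li><b>Pear</b> 0.80</li>
  <li><b>Plum</b> 2.50</li>
</ul>
"""


def tokenize(html):
    """Encode a page as (symbol, text) pairs; text runs become the symbol 'TEXT'."""
    tokens = []
    for tag, text in re.findall(r"<\s*(/?\w+)[^>]*>|([^<]+)", html):
        if tag:
            tokens.append(("<" + tag.lower() + ">", None))
        elif text.strip():  # ignore whitespace-only runs between tags
            tokens.append(("TEXT", text.strip()))
    return tokens


def discover_pattern(symbols, min_len=3, max_len=12):
    """Naive pattern discovery (an assumption, not IEPAD's method):
    count every n-gram of symbols and return the one that repeats most
    often, preferring longer n-grams on ties. Returns None if nothing repeats."""
    counts = Counter()
    for n in range(min_len, min(max_len, len(symbols)) + 1):
        for i in range(len(symbols) - n + 1):
            counts[tuple(symbols[i:i + n])] += 1
    repeated = {p: c for p, c in counts.items() if c >= 2}
    if not repeated:
        return None
    return max(repeated, key=lambda p: (repeated[p], len(p)))


def extract_records(tokens, pattern):
    """Apply the discovered pattern as an extraction rule: each non-overlapping
    match yields one record made of the text found at its TEXT positions."""
    symbols = [sym for sym, _ in tokens]
    n = len(pattern)
    records, i = [], 0
    while i + n <= len(symbols):
        if tuple(symbols[i:i + n]) == pattern:
            records.append([txt for sym, txt in tokens[i:i + n] if sym == "TEXT"])
            i += n
        else:
            i += 1
    return records


tokens = tokenize(SAMPLE_HTML)
pattern = discover_pattern([sym for sym, _ in tokens])
records = extract_records(tokens, pattern)
print(pattern)
print(records)
```

On the sample page, the discovered pattern is the tag sequence of a single list item, and each match yields one record with a name field and a price field. Real pages are noisier: a practical extractor would need to tolerate optional tags and variation between records, which simple exact n-gram matching does not handle.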