This paper proposes a shopping agent with a robust inductive learning method that automatically constructs wrappers for semi-structured online stores. The strong biases assumed by many existing systems are weakened so that real stores with reasonably complex document structures can be handled. Our method treats a logical line as its basic unit and recognizes the position and structure of product descriptions by finding the most frequent pattern in the sequence of logical-line information extracted from output HTML pages. The method can analyze product descriptions that span multiple logical lines, including those with extra or missing attributes. Experimental tests on over 60 sites show that it constructs correct wrappers for most real stores.
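
The idea of treating logical lines as units and looking for the most frequent pattern can be illustrated with a small sketch. This is not the paper's algorithm, whose details the abstract does not give. It is a naive stand-in that assumes a logical line ends at a handful of line-breaking tags, that each line maps to one of two coarse categories (`PRICE` or `TEXT`), and that the pattern is a fixed-length run of categories. The names `logical_lines`, `categorize`, `most_frequent_pattern`, and `find_records`, as well as the sample page, are hypothetical.

```python
import re
from collections import Counter

# Assumption: these tags end a logical line. The abstract does not define this set.
BREAK_TAGS = re.compile(r"<\s*/?\s*(?:br|p|tr|li|div|hr)\b[^>]*>", re.IGNORECASE)
ANY_TAG = re.compile(r"<[^>]+>")
PRICE = re.compile(r"\$\s*\d+(?:\.\d{2})?")


def logical_lines(html):
    """Split HTML at line-breaking tags and strip any remaining markup."""
    parts = BREAK_TAGS.split(html)
    lines = [ANY_TAG.sub(" ", part).strip() for part in parts]
    return [line for line in lines if line]


def categorize(line):
    """Map a logical line to a coarse category (an illustrative scheme only)."""
    return "PRICE" if PRICE.search(line) else "TEXT"


def most_frequent_pattern(categories, length):
    """Return the most frequent run of `length` consecutive categories."""
    counts = Counter(
        tuple(categories[i:i + length])
        for i in range(len(categories) - length + 1)
    )
    return counts.most_common(1)[0][0]


def find_records(categories, pattern):
    """Return start indices of non-overlapping occurrences of `pattern`."""
    starts, i, n = [], 0, len(pattern)
    while i + n <= len(categories):
        if tuple(categories[i:i + n]) == pattern:
            starts.append(i)
            i += n  # skip past this record so matches do not overlap
        else:
            i += 1
    return starts


# Hypothetical result page: one header line followed by three products.
html = (
    "<p>Search results</p>"
    "<tr>Widget A</tr><tr>$10.00</tr>"
    "<tr>Widget B</tr><tr>$12.50</tr>"
    "<tr>Widget C</tr><tr>$8.99</tr>"
)

lines = logical_lines(html)
cats = [categorize(line) for line in lines]
pattern = most_frequent_pattern(cats, length=2)
starts = find_records(cats, pattern)
records = [lines[s:s + len(pattern)] for s in starts]
```

On this sample page the header line falls outside every match, so only the three name and price pairs are returned as records. The sketch requires exact pattern matches and a known pattern length, so it does not handle the extra or missing attributes that the abstract says the proposed method tolerates.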