Tasks are performed on the web worldwide for many different purposes (banking, shopping, auctions, e-mail, hotel reservations, flight booking, etc.). Until now, typical HTML-based web browsers have required users to interact mechanically and continually with on-screen views of remotely retrieved documents: clicking on links or buttons, filling in and submitting forms, scrolling, and visually locating data on the screen, to name a few. When the amount of data in those documents is large, this manual navigation easily becomes overwhelming in cost and effort, even for the simplest tasks. Developing ad-hoc wrapper agents that automate these tasks for the user, by intelligently integrating semistructured web data from heterogeneous sources, may considerably reduce the interaction and effort required. Bargain finders and price comparers, among others, could present only the final, valuable results to users, sparing them most of the navigation effort. However, ad-hoc wrapper agents have traditionally had high development and maintenance costs. Because HTML is semistructured, even a minor unexpected change to a page often breaks them. This paper presents several new standards-based techniques for reducing these development and maintenance costs and for making these programs more compact and stable.