For software programs to access and use the information and services provided by web sources, wrapper programs must be built to provide a "machine-readable" view over them. Although the research literature on web wrappers is vast, the problem of how to specify the internal logic of complex wrappers in a simple, graphical way has been largely ignored. In this paper, we propose a new language for this task. Our approach uses existing work on intelligent web data extraction and automatic web navigation as building blocks, and specifies the wrapper control logic with a workflow-based approach. The language features were chosen based on a study of a wide range of real web automation applications from different business areas. We also present the most salient results of that study.