A Conceptual Model and Rule-Based Query Language for HTML
World Wide Web
Abstract: With the recent popularity of the web, an enormous amount of information is now available online. Most web documents are in HTML format and are hierarchically structured. Querying such documents based on their internal hierarchical structure is becoming increasingly important. In this paper, we present a rule-based language called WebQL to support effective and flexible web queries. Unlike other web query languages, WebQL is a high-level declarative query language with a logical semantics. It allows web documents to be queried based on their internal hierarchical structure, and it supports negation, recursion, and the restructuring of query results in a natural way. We also describe the implementation of a system that supports the WebQL query language.