Most documents available over the web conform to the HTML specification. They are intended to be read by humans through a web browser and are therefore constructed following some common conventions. Based on these conventions, the Conceptual Model for HTML was recently proposed to automatically capture the hierarchical structure within web documents. However, certain key semantic information about the contents of the documents, which is obvious to a human reader, is often omitted. As a result, processing, manipulating, and integrating web data remain quite difficult. In this paper, we discuss how to extend the Conceptual Model for HTML to capture the intended semantics of HTML documents. We show that, with the new constructs introduced, an Intelligent Wrapper, and limited human interaction, semantics can be transferred from humans into the Extended Conceptual Model so that further meaningful processing, manipulation, and integration of web documents become possible.