Multilevel nested relational structures
Journal of Computer and System Sciences
A relational algebra for complex objects based on partial information
MFDBS 91 Proceedings of the 3rd symposium on Mathematical fundamentals of database and knowledge base systems
A query language and optimization techniques for unstructured data
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
PODS '97 Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Inferring structure in semistructured data
ACM SIGMOD Record
Extracting schema from semistructured data
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Interaction between path and type constraints
PODS '99 Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Extracting semi-structured data through examples
Proceedings of the eighth international conference on Information and knowledge management
Conceptual-model-based data extraction from multiple-record Web pages
Data & Knowledge Engineering
Foundations of Databases: The Logical Level
Foundations of Databases: The Logical Level
Remarks on the algebra of non first normal form relations
PODS '82 Proceedings of the 1st ACM SIGACT-SIGMOD symposium on Principles of database systems
Object Exchange Across Heterogeneous Information Sources
ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering
Query by Example for Nested Tables
DEXA '98 Proceedings of the 9th International Conference on Database and Expert Systems Applications
Managing Web Data through Views
EC-Web 2001 Proceedings of the Second International Conference on Electronic Commerce and Web Technologies
An Example-Based Environment for Wrapper Generation
ER '00 Proceedings of the Workshops on Conceptual Modeling Approaches for E-Business and The World Wide Web and Conceptual Modeling: Conceptual Modeling for E-Business and the Web
Toolkits for Generating Wrappers
NODe '02 Revised Papers from the International Conference NetObjectDays on Objects, Components, Architectures, Services, and Applications for a Networked World
Hi-index | 0.00 |
The popularization of the Web has made a huge volume of data available for a large audience. In a large number of Web sites, such as bookstores, electronic catalogs, travel agencies, etc., the pages constitute documents which are composed of pieces of data whose overall structure can be easily recognized. Such pages are called data-rich and can be seen as collections of complex objects. In this paper, we show how such objects can be represented by nested tables, which are simple, intuitive, and quite convenient for expressing their implicit structure. The assumption is that, for most sites of interest, only few examples are required to reveal the structure of the objects. To corroborate our assumption, we describe a data extraction tool that adopts this approach and present results of some experiments carried out with this tool.