The volume of (semi-)structured data on the Web has increased substantially. This opens new opportunities for exploring and querying these data in ways that go beyond the keyword-based queries traditionally used on the Web. Supporting queries over a very large number of apparently disconnected Web sources is challenging, however. In this paper we propose indexing methods that capture both the structure of the sources and the connections between them. The indexes are designed for data represented as relations, such as HTML tables, and support queries with predicates. We show how associations between overlapping sources are discovered, captured in the indexes, and used to derive query rewritings that join multiple sources. We demonstrate, through an experimental evaluation, that our approach scales to a large number of sources.
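To make the idea concrete, the following is a minimal illustrative sketch, not the paper's actual index structures or rewriting algorithm. It assumes each source is a small relation with named columns, treats two sources as associated when they share a column name with at least one common value, and only considers rewritings over one source or a join of two. The toy data and the names `SOURCES`, `build_indexes`, `discover_associations`, and `rewrite` are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy sources: name -> (column names, rows).
SOURCES = {
    "cities": (("city", "country"),
               [("Paris", "France"), ("Lyon", "France"), ("Rome", "Italy")]),
    "capitals": (("country", "capital"),
                 [("France", "Paris"), ("Italy", "Rome")]),
    "teams": (("team", "city"),
              [("PSG", "Paris"), ("Roma", "Rome")]),
}

def build_indexes(sources):
    """Index the structure (attribute -> sources) and the column values."""
    attr_index = {}    # attribute name -> set of source names
    value_index = {}   # (source name, attribute name) -> set of values
    for name, (cols, rows) in sources.items():
        for i, col in enumerate(cols):
            attr_index.setdefault(col, set()).add(name)
            value_index[(name, col)] = {row[i] for row in rows}
    return attr_index, value_index

def discover_associations(attr_index, value_index, min_overlap=1):
    """Assumed heuristic: two sources are associated on an attribute
    when their columns of that name share at least min_overlap values."""
    assoc = {}
    for attr, names in attr_index.items():
        for a, b in combinations(sorted(names), 2):
            shared = value_index[(a, attr)] & value_index[(b, attr)]
            if len(shared) >= min_overlap:
                assoc[(a, b, attr)] = len(shared)
    return assoc

def rewrite(query_attrs, sources, assoc):
    """Return candidate rewritings covering query_attrs: a single source
    as (name,), or a two-source join as (left, right, join_attribute)."""
    wanted = set(query_attrs)
    plans = [(name,) for name, (cols, _) in sources.items()
             if wanted <= set(cols)]
    for (a, b, attr) in assoc:
        cols_a, cols_b = set(sources[a][0]), set(sources[b][0])
        # Keep a join only if neither side answers the query alone.
        if wanted <= (cols_a | cols_b) and not (wanted <= cols_a or wanted <= cols_b):
            plans.append((a, b, attr))
    return plans

attr_index, value_index = build_indexes(SOURCES)
assoc = discover_associations(attr_index, value_index)
print(rewrite(["team", "country"], SOURCES, assoc))
```

In this sketch, a query over `team` and `country` cannot be answered by any single source, so the only candidate is a join of `cities` and `teams` on `city`. Matching columns by name is a simplification: the `capital` and `city` columns overlap in values but are not linked here.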