Indexing relations on the web

  • Authors:
  • Sergio Luis Sardi Mergen; Juliana Freire; Carlos Alberto Heuser

  • Affiliations:
  • Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, RS, Brazil; School of Computing, University of Utah, Salt Lake City; Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, RS, Brazil

  • Venue:
  • Proceedings of the 13th International Conference on Extending Database Technology
  • Year:
  • 2010

Abstract

There has been a substantial increase in the volume of (semi-)structured data on the Web. This opens new opportunities for exploring and querying these data that go beyond the keyword-based queries traditionally used on the Web. But supporting queries over a very large number of apparently disconnected Web sources is challenging. In this paper, we propose index methods that capture both the structure of the sources and the connections between them. The indexes are designed for data that are represented as relations, such as HTML tables, and support queries with predicates. We show how associations between overlapping sources are discovered, captured in the indexes, and used to derive query rewritings that join multiple sources. We demonstrate, through an experimental evaluation, that our approach scales to a large number of sources.
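The general idea in the abstract can be illustrated with a toy sketch. This is not the index structure or algorithm from the paper, which the abstract does not specify. It assumes a simple attribute-name index, a value-overlap test for discovering associations, and a rewriting step limited to joins of two sources. All names (`SOURCES`, `build_attribute_index`, `find_associations`, `rewrite`) and the sample data are hypothetical, and query predicates are not handled.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy sources: each relation is a name plus a list of row dicts.
SOURCES = {
    "cities": [
        {"city": "Porto Alegre", "country": "Brazil"},
        {"city": "Salt Lake City", "country": "USA"},
    ],
    "universities": [
        {"university": "UFRGS", "city": "Porto Alegre"},
        {"university": "University of Utah", "city": "Salt Lake City"},
    ],
    "colors": [
        {"color": "red", "hex": "#ff0000"},
    ],
}


def build_attribute_index(sources):
    """Map each attribute name to the set of sources that expose it."""
    index = defaultdict(set)
    for name, rows in sources.items():
        for row in rows:
            for attr in row:
                index[attr].add(name)
    return index


def find_associations(sources, min_overlap=1):
    """Return (source_a, attr_a, source_b, attr_b) for columns sharing values.

    Assumption: two columns from different sources are treated as joinable
    when they share at least `min_overlap` distinct values.
    """
    columns = {}
    for name, rows in sources.items():
        for row in rows:
            for attr, value in row.items():
                columns.setdefault((name, attr), set()).add(value)
    associations = []
    ordered = sorted(columns.items(), key=lambda kv: kv[0])
    for (key_a, vals_a), (key_b, vals_b) in combinations(ordered, 2):
        if key_a[0] != key_b[0] and len(vals_a & vals_b) >= min_overlap:
            associations.append((key_a[0], key_a[1], key_b[0], key_b[1]))
    return associations


def rewrite(query_attrs, attr_index, associations):
    """List single sources, or joined pairs, that cover all query attributes."""
    attrs_of = defaultdict(set)
    for attr, names in attr_index.items():
        for name in names:
            attrs_of[name].add(attr)
    # Single sources that answer the query on their own.
    rewritings = [(name,) for name in sorted(attrs_of)
                  if query_attrs <= attrs_of[name]]
    # Two-source joins, kept only when neither side suffices alone.
    for a, attr_a, b, attr_b in associations:
        covered = query_attrs <= (attrs_of[a] | attrs_of[b])
        alone = query_attrs <= attrs_of[a] or query_attrs <= attrs_of[b]
        if covered and not alone:
            rewritings.append((a, attr_a, b, attr_b))
    return rewritings


attr_index = build_attribute_index(SOURCES)
associations = find_associations(SOURCES)
join_plan = rewrite({"university", "country"}, attr_index, associations)
single_plan = rewrite({"city"}, attr_index, associations)
```

In this sketch, a query over `university` and `country` cannot be answered by any single source, so it is rewritten as a join of `cities` and `universities` on their overlapping `city` columns. A query over `city` alone is answered by either source directly. Comparing every pair of columns is quadratic and would not scale to many sources, which is the kind of cost a purpose-built index is meant to avoid.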