Optimization of multi-domain queries on the web

  • Authors:
  • Daniele Braga; Stefano Ceri; Florian Daniel; Davide Martinenghi

  • Affiliations:
  • Politecnico di Milano, Piazza Leonardo da Vinci, Milano, Italy (all four authors)

  • Venue:
  • Proceedings of the VLDB Endowment
  • Year:
  • 2008

Abstract

Where can I attend an interesting database workshop close to a sunny beach? Who are the strongest experts on service computing, based on their recent publication record and accepted European projects? Can I spend an April weekend in a city that is served by a low-cost direct flight from Milano and is offering a Mahler symphony? We regard these as multi-domain queries, i.e., queries that can be answered by combining knowledge from two or more domains (such as seaside locations, flights, publications, accepted projects, conference offerings, and so on). This information is available on the Web, but no general-purpose software system can accept such queries or compute their answers. At most, dedicated systems support specific multi-domain compositions (e.g., Google-local locates information such as restaurants and hotels on geographic maps). This paper presents an overall framework for multi-domain queries on the Web. We address the following problems: (a) expressing multi-domain queries with an abstract formalism; (b) treating "search" services separately within the model, by highlighting their differences from "exact" Web services; (c) explaining how the same query can be mapped to multiple "query plans", each a well-defined schedule of service invocations, possibly in parallel, that complies with the services' access limitations and preserves the ranking order in which search services return results; (d) introducing cross-domain joins as first-class operations within plans; (e) evaluating the query plans against several cost metrics so as to choose the most promising one for execution. The framework adapts to a variety of application contexts, ranging from end-user-oriented mash-up scenarios to complex application integration scenarios.
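As a rough illustration of points (c) and (e), the sketch below enumerates sequential plans that respect access limitations and picks the cheapest one under a single toy cost metric. It is not the paper's formalism or algorithm. The `Service` class, the service names, the attribute names, the average result sizes, and the `invocation_cost` metric are all assumptions made up for this example; parallel invocations and rank preservation are deliberately left out.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Service:
    """Hypothetical service description (not taken from the paper)."""
    name: str
    inputs: frozenset    # attributes that must be bound before the call
    outputs: frozenset   # attributes bound by the call
    avg_results: float   # assumed average number of tuples returned per call


def feasible(order, bound):
    """A sequential plan is feasible if each service's inputs are bound when it is called."""
    bound = set(bound)
    for service in order:
        if not service.inputs <= bound:
            return False
        bound |= service.outputs
    return True


def invocation_cost(order):
    """Toy metric: estimated number of calls, one call per upstream tuple."""
    calls, tuples = 0.0, 1.0
    for service in order:
        calls += tuples
        tuples *= service.avg_results
    return calls


def best_plan(services, bound, cost=invocation_cost):
    """Return the cheapest feasible ordering, or None if no ordering is feasible."""
    plans = [p for p in permutations(services) if feasible(p, bound)]
    return min(plans, key=cost) if plans else None


# Illustrative services loosely inspired by the example queries (all values assumed).
services = [
    Service("concerts", frozenset({"city"}), frozenset({"event"}), 2.0),
    Service("flights", frozenset({"origin", "city"}), frozenset({"fare"}), 0.5),
    Service("conferences", frozenset({"topic"}), frozenset({"city"}), 10.0),
]
bound = {"topic", "origin"}  # attributes supplied by the user's query

plan = best_plan(services, bound)
print([s.name for s in plan], invocation_cost(plan))
```

Under these assumed numbers, both feasible plans must start with `conferences`, because the other two services need `city` as an input. Calling the more selective `flights` service before `concerts` reduces the estimated number of invocations. A fuller treatment would compare several metrics and allow parallel branches, as the abstract describes.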