Towards a theory of search queries

  • Authors: George H. L. Fletcher; Jan Van Den Bussche; Dirk Van Gucht; Stijn Vansummeren

  • Affiliations: Eindhoven University of Technology, The Netherlands; Hasselt University and Transnational University of Limburg, Belgium; Indiana University, Bloomington; Université Libre de Bruxelles, Belgium

  • Venue: ACM Transactions on Database Systems (TODS)
  • Year: 2010

Abstract

The need to manage diverse information sources has triggered the rise of very loosely structured data models, known as dataspace models. Such information management systems must allow querying in simple ways, mostly by a form of searching. Motivated by these developments, we propose a theory of search queries in a general model of dataspaces. In this model, a dataspace is a collection of data objects, where each data object is a collection of data items. Basic search queries are expressed using filters on data items, following the basic model of Boolean search in information retrieval. We characterize semantically the class of queries that can be expressed by searching. We apply our theory to classical relational databases, where we connect search queries to the known class of fully generic queries, and to dataspaces where data items are formed by attribute-value pairs. We also extend our theory to a more powerful, associative form of searching, where one can ask for objects that are similar to objects satisfying given search conditions. Such associative search queries are shown to correspond to a very limited kind of joins. We show that the basic search language extended with associative search can exactly define the queries definable in a restricted fragment of the semijoin algebra working on an explicit relational representation of the dataspace.
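
To make the model in the abstract concrete, here is a minimal Python sketch. It is an illustration only, not the paper's formal definitions. Three things in it are assumptions made for the example: an object matches a filter when at least one of its items satisfies that filter; data items are attribute-value pairs; and two objects count as "similar" when they share at least one item. All function names (has, and_, or_, not_, search, associative_search, shares_item) and the sample data are hypothetical.

```python
from typing import Callable, FrozenSet, Hashable, Set

Item = Hashable                      # a data item, here an (attribute, value) pair
DataObject = FrozenSet[Item]         # a data object is a collection of data items
Query = Callable[[DataObject], bool]


def has(item_filter: Callable[[Item], bool]) -> Query:
    """Basic search condition (assumed semantics): some item passes the filter."""
    return lambda obj: any(item_filter(item) for item in obj)


def and_(p: Query, q: Query) -> Query:
    return lambda obj: p(obj) and q(obj)


def or_(p: Query, q: Query) -> Query:
    return lambda obj: p(obj) or q(obj)


def not_(p: Query) -> Query:
    return lambda obj: not p(obj)


def search(dataspace: Set[DataObject], query: Query) -> Set[DataObject]:
    """Boolean search: return the objects of the dataspace that satisfy the query."""
    return {obj for obj in dataspace if query(obj)}


def shares_item(a: DataObject, b: DataObject) -> bool:
    """Assumed similarity relation: the two objects have an item in common."""
    return bool(a & b)


def associative_search(
    dataspace: Set[DataObject],
    query: Query,
    similar: Callable[[DataObject, DataObject], bool],
) -> Set[DataObject]:
    """Return objects similar to at least one object that satisfies the query."""
    seeds = search(dataspace, query)
    return {obj for obj in dataspace if any(similar(obj, seed) for seed in seeds)}


# Hypothetical dataspace of attribute-value objects.
o1 = frozenset({("type", "paper"), ("author", "fletcher"), ("year", 2010)})
o2 = frozenset({("type", "email"), ("author", "fletcher")})
o3 = frozenset({("type", "paper"), ("year", 2008)})
o4 = frozenset({("type", "photo"), ("place", "hasselt")})
dataspace = {o1, o2, o3, o4}

is_paper = has(lambda item: item == ("type", "paper"))
from_2010 = has(lambda item: item == ("year", 2010))

papers_2010 = search(dataspace, and_(is_paper, from_2010))
related = associative_search(dataspace, and_(is_paper, from_2010), shares_item)
non_papers = search(dataspace, not_(is_paper))
```

In this sketch, the basic query selects only o1, while the associative query also returns o2 (same author) and o3 (same type). The associative step behaves like a semijoin of the dataspace with the seed set on the similarity relation, which is the intuition behind the abstract's remark that associative search corresponds to a very limited kind of join.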