Précis: from unstructured keywords as queries to structured databases as answers

  • Authors:
  • Alkis Simitsis; Georgia Koutrika; Yannis Ioannidis

  • Affiliations:
  • National Technical University of Athens, Athens, Greece 15772 and IBM Almaden Research Center, San Jose, USA; University of Athens, Athens, Greece 15784 and Stanford University, Stanford, USA; University of Athens, Athens, Greece 15784

  • Venue:
  • The VLDB Journal — The International Journal on Very Large Data Bases
  • Year:
  • 2008

Abstract

Précis queries represent a novel way of accessing data that combines ideas and techniques from the fields of databases and information retrieval. They are free-form, keyword-based queries on top of relational databases that generate entire multi-relation databases, which are logical subsets of the original ones. A logical subset contains not only items directly related to the given query keywords but also items implicitly related to them in various ways, with the purpose of giving the user much greater insight into the original data. In this paper, we lay the foundations for the concept of logical database subsets that are generated from précis queries under a generalized perspective that removes several restrictions of previous work. In particular, we extend the semantics of précis queries so that they may contain multiple terms combined through the AND, OR, and NOT operators. On the basis of these extended semantics, we define the concept of a logical database subset, identify the one that is most relevant to a given query, and provide algorithms for its generation. Finally, we present an extensive set of experimental results that demonstrate the efficiency and benefits of our approach.
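
To make the idea of a logical subset concrete, the following is a minimal toy sketch and not the algorithm described in the paper. It assumes a hypothetical two-relation database (MOVIES and DIRECTORS, linked by a foreign key) and a hypothetical helper, precis_like_subset, that filters tuples by keywords combined with AND, OR, and NOT, and then pulls in the directors that are only implicitly related to the query. Relevance ranking and the paper's actual semantics are not modeled.

```python
# Toy illustration only: NOT the paper's algorithm. All names and data are hypothetical.
DIRECTORS = {
    1: {"name": "Woody Allen"},
    2: {"name": "Sofia Coppola"},
}
MOVIES = {
    10: {"title": "Match Point", "genre": "drama", "director_id": 1},
    11: {"title": "Scoop", "genre": "comedy", "director_id": 1},
    12: {"title": "Lost in Translation", "genre": "drama", "director_id": 2},
}


def matches(row, term):
    """True if the keyword occurs in any attribute value of the row (case-insensitive)."""
    return any(term.lower() in str(value).lower() for value in row.values())


def precis_like_subset(all_of=(), any_of=(), none_of=()):
    """Return a small multi-relation subset: matching movies plus their directors.

    all_of  -> terms combined with AND
    any_of  -> terms combined with OR (ignored when empty)
    none_of -> terms excluded with NOT
    """
    def keep(row):
        return (
            all(matches(row, t) for t in all_of)
            and (not any_of or any(matches(row, t) for t in any_of))
            and not any(matches(row, t) for t in none_of)
        )

    # Tuples directly related to the keywords.
    movies = {key: row for key, row in MOVIES.items() if keep(row)}
    # Tuples implicitly related: follow the foreign key to the directors relation.
    director_ids = {row["director_id"] for row in movies.values()}
    directors = {key: DIRECTORS[key] for key in director_ids}
    return {"movies": movies, "directors": directors}


# "drama AND NOT translation": one movie matches, and its director comes along.
subset = precis_like_subset(all_of=["drama"], none_of=["translation"])
```

The result is itself a small database with two relations rather than a flat list of hits, which is the point the abstract makes about answers being logical subsets of the original database.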